Research Article: Global Solar Radiation Forecasting Using Square Root Regularization-Based Ensemble

Hindawi
Mathematical Problems in Engineering

Volume 2019, Article ID 9620945, 20 pages
https://doi.org/10.1155/2019/9620945
Yao Dong (1,2) and He Jiang (1,2)
(1) School of Statistics, JiangXi University of Finance and Economics, Nanchang JiangXi 330013, China
(2) Applied Statistics Research Center, Nanchang Jiangxi 330013, China
Correspondence should be addressed to He Jiang; jiangsky2005@aliyun.com
Received 26 February 2019; Revised 30 April 2019; Accepted 2 May 2019; Published 14 May 2019
Academic Editor: Bogdan Dumitrescu
Copyright © 2019 Yao Dong and He Jiang. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.
In recent decades, the integration of solar energy sources has gradually become the main challenge for global energy consumption.
Therefore, it is essential to predict global solar radiation in an accurate and efficient way when estimating outputs of the solar system.
Inaccurate predictions either cause load overestimation that results in increased cost or failure to gather adequate supplies. However,
accurate forecasting is a challenging task because solar resources are intermittent and uncontrollable. To tackle this difficulty, several
machine learning models have been established; however, the forecasting outcomes of these models are not sufficiently accurate.
Therefore, in this study, we investigate ensemble learning with square root regularization and intelligent optimization to forecast
hourly global solar radiation. The main structure of the proposed method is constructed based on ensemble learning with a random
subspace (RS) method that divides the original data into several covariate subspaces. A novel covariate-selection method called
square root smoothly clipped absolute deviation (SRSCAD) is proposed and is applied to each subspace with efficient extraction
of relevant covariates. To combine the forecasts obtained using RS and SRSCAD, a firefly algorithm (FA) is used to estimate the
weights assigned to individual forecasts. To handle the complexity of the proposed ensemble system, a simple and efficient algorithm
is derived based on a thresholding rule and accelerated gradient method. To illustrate the validity and effectiveness of the proposed
method, global solar radiation datasets of eight locations of Xinjiang province in China are considered. The experimental results
show that the proposed RS-SRSCAD-FA achieves the best performance, with a mean absolute percentage error, root-mean-square
error, Theil inequality coefficient, computation cost, and correlation coefficient of 0.066, 20.21 W/m², 0.016, 3.40 s, and 0.98 at site 1, respectively. For
the other seven datasets, RS-SRSCAD-FA still outperforms other approaches. Finally, a nonparametric Friedman test is applied to
perform statistical comparisons of results over eight datasets.
1. Introduction

An increasing number of countries are paying substantial attention to environmental issues such as global warming, climate change, and greenhouse gas emissions. A major cause of global warming is the burning of fossil fuels, including coal and oil, for traditional electricity generation. This encourages the exploitation and utilization of renewable energy sources including solar, wind, hydro, tidal, wave, biomass, and geothermal sources as alternatives [1]. Nevertheless, solar energy, as a typical renewable energy source, is stochastic, intermittent, time-varying, and uncertain, which may undermine the stability and reliability of power grid systems [2–4]. This introduces new challenges for the incorporation of solar energy sources into the power grids. In addition, solar radiation can be measured based on sensors including pyrheliometers, pyranometers, and radiometers combined with data-acquisition hardware and software. However, it is time-consuming, cumbersome, and expensive to install sensors across the world [5].

To tackle these challenges, it is necessary to develop accurate global solar radiation forecasting models. A wide variety of models have been established using machine learning techniques. Machine learning, which is classified as an artificial intelligence technology, is a subfield of computer science. It can be applied to numerous domains, and its characteristic is that the model can identify relations between inputs and outputs notwithstanding whether the
representation is infeasible. This advantage promotes the application of machine learning models in classification, pattern recognition, spam filtering, data mining, and forecasting [6]. At present, a popular method for forecasting global solar radiation is to use an artificial neural network (ANN). ANNs have been used in various areas with different input covariates (multiple meteorological, geographical, and astronomical parameters such as temperature, humidity, wind speed, precipitation, altitude, declination, and zenith angle) and multiple time scales (hourly, daily, and monthly) [7, 8]. The main advantage of ANNs is that they do not require as many adjustable parameters as other classical techniques and provide higher forecasting accuracy. Ozgoren et al. [9] proposed an ANN model based on a multinonlinear regression (MNLE) algorithm to forecast the monthly mean daily sum global solar radiation for Turkey. Hejase et al. used multilayer perceptron and radial basis function (RBF) techniques with comprehensive training architectures and different combinations of inputs to forecast global horizontal irradiance for three major cities in the United Arab Emirates [10]. Renno et al. [11] developed two ANNs to forecast daily global radiation and hourly direct normal irradiance for the University of Salerno. Notwithstanding their growing popularity, ANNs have drawbacks. For example, the backpropagation neural network (BPNN) exhibits difficulty in training the input data owing to the iterative tuning of parameters required for hidden neurons because it does not always produce a unique global solution with different models; moreover, BPNN exhibits a slow response owing to its gradient-based learning algorithm and requires a longer training time [12, 13].

It is well known that the support vector machine (SVM) developed by Vapnik [14] is a powerful tool for exploring appropriate SVM regressors to establish a prediction model with sufficiently high accuracy. The main goal of SVM regressors is to construct the optimum prediction model that boosts the prediction accuracy efficiently. It is a regularization network and exhibits an advantage over ANN models. To reduce the high computation cost, least-squares SVM (LSSVM) uses a least-squares method via a set of linear equations based on structural risk minimization; therefore, it is able to prevent overfitting of the training data and does not require iterative tuning of model parameters [15–17]. Wu and Liu investigated an SVM model to estimate monthly mean daily solar radiation using measured minimum, mean, and maximum air temperatures in 24 sites in China. Five SVMs based on different input covariates were developed [18]. Chen et al. used seven SVMs and five empirical sunshine-based models to forecast daily global solar radiation at three sites in Liaoning, China. The experimental results demonstrated that all SVMs exhibited higher performance than empirical models [19]. Ekici proposed an LSSVM-based intelligent model to forecast the daily solar insolation for Turkey [20]. It was noted that the temporal behavior of global solar radiation and its corresponding atmospheric input covariates are complex and chaotic [21]. Although SVM is flexible in handling nonlinear input covariates, it encounters fluctuations and difficulties in training input datasets with wide frequencies [22]. Some researchers have focused on Gaussian process regression (GPR) methodology, which shows promising results with lower computational costs. Rohani et al. [23] presented GPR with a K-fold cross-validation (CV) model to forecast daily and monthly solar radiation at Mashhad city in Iran based on several meteorological parameters. Considering the effect of climatic factors such as sunshine duration, relative humidity, and air temperature, Guermoui et al. [24] used the GPR to predict the daily global solar radiation in the Saharan climate. Guermoui et al. [25] proposed a novel method called weighted Gaussian process regression (WGPR) to predict daily global and direct horizontal solar radiation in the Saharan climate. The experimental results illustrate that these developed models exhibit good forecasting performance.

It is challenging for a single forecasting model to provide an optimum solution to a complex problem. Each model has advantages and drawbacks. Thus, it is worthwhile to investigate appropriate combinations of their advantages for implementing the forecasting task. As solar radiation simultaneously exhibits nonstationary and nonlinear characteristics at different time scales, the analysis of these fluctuations at all scales can be performed using multiscale decomposition methods. Wang et al. [26] decomposed solar radiation into a quasi-stationary process at different frequency scales using wavelet transform (WT) and employed BPNN to predict solar radiation based on temperature, clearness index, and phase space reconstruction radiation. Monjoly et al. [27] applied three multiscale decomposition methods, empirical mode decomposition (EMD), ensemble empirical mode decomposition (EEMD), and WT, to analyze the nonstationary and nonlinear features of the clear sky index, and then developed a hybrid AR-NN model that combines an autoregressive (AR) process and a neural network (NN) to forecast global horizontal solar irradiation (GHI). The experimental results showed that the WT-hybrid model has the best forecasting accuracy. To improve the forecasting performance, Hussain and Alalili [28] investigated a new hybrid technique based on WT and four different architectures of ANN to forecast GHI in the Abu Dhabi region.

Ensemble methods are straightforward and highly effective techniques that average the approximate predictions of a few weak learners to present accurate estimates rather than find a single remarkable sophisticated learner [29–31]. Ensemble methods have gained prominence. The most commonly used ensemble methods include random forest, gradient boosting, and bagging. Although these methods have wide applications in other scientific fields, they have only rarely been adopted in forecasting global solar radiation [6]. Gala et al. [32] used support vector regression, random forest regression, and gradient boosted regression as a hybrid model to improve 3-hour accumulated radiation forecasts for seven sites in Spain. Sun et al. [33] proposed different random forest models with and without considering the air pollution index to estimate solar radiation in three sites in China through comparisons with a few empirical models; their experimental results demonstrated that the performance of random forest techniques with the air pollution index is higher. Aler et al. [34] applied gradient boosting machine learning to improve the performance of diffuse and direct separation models. Ensembles of linear and nonlinear weak learners have both
been adopted. The results indicated that the gradient boosting method can reduce large random errors of separation models efficiently.

The computational cost is mainly determined by the complexity of the forecasting model. It is time-consuming if several covariates are employed to establish the model. Thus, it is essential to investigate an efficient method to simplify the model structure. Covariate selection is an efficient method to extract the important covariates. There are several penalized covariate selection methods including least absolute shrinkage and selection operator (LASSO) [35], least angle regression (LARS) [36], elastic net [37], and adaptive LASSO [38]. LASSO is a popular covariate selection method applying an $\ell_1$-type penalty. Yang et al. [39] selected the important time-lagged covariate using LASSO and revealed its advantages over the exponential smoothing (ES) model and the autoregressive integrated moving average (ARIMA) model. LARS is closely associated with LASSO, and their connection is established in Efron et al. [36]. Zou proposed adaptive LASSO involving the addition of weights to the penalty parameter of each feature [38]. It outperforms LASSO in terms of selection consistency. The elastic net was first proposed by Zou, and it considers an $\ell_2$-type penalty in addition to the $\ell_1$-type penalty [37]. Thus, the elastic net applies an $\ell_1 + \ell_2$-type penalty, which provides the following advantages: the $\ell_1$ part focuses on feature selection and the $\ell_2$ part can be applied to boost forecasting accuracy. However, Zhang determined that the abovementioned feature selection methods apply convex penalties, which makes it difficult to detect the important covariates when the model coherence is high [40]. To solve this problem, nonconvex penalized covariate selection approaches are advocated. Knight and Fu proposed bridge regression, which focuses on the penalty function of the $\ell_r$ ($0 < r < 2$) form [41]. Fan and Li [42] proposed the smoothly clipped absolute deviation (SCAD) method and demonstrated its advantage over LASSO. However, to the best of the present authors' knowledge, there is no research work on feature selection using both a square root loss function and the SCAD penalty in the literature. The main contribution of this study is the advocation of a novel global solar radiation forecasting approach that combines ensemble learning, square root smoothly clipped absolute deviation (SRSCAD), and the firefly algorithm (FA). Specifically, the contributions of this study are given as follows:

(i) An SRSCAD method is proposed to extract the important covariates efficiently. Moreover, the optimal regularization parameter is selected by a 10-fold CV; this CV completely utilizes the data through resampling, which is particularly useful for data with a small sample size.
(ii) Ensemble learning is the main structure of the proposed method. The diversity created by covariate selection of the ensemble system substantially aids in boosting the forecasting accuracy. The FA, which is demonstrated to obtain the global optimal solution successfully, is applied as a weight policy in the forecasting model.
(iii) A simple-to-implement and efficient algorithm is designed based on thresholding rules, and an accelerated gradient method is used to further speed up the convergence of the algorithm.

The rest of this paper is organized as follows. Section 2 presents a preliminary discussion of related methodologies. Section 3 presents the proposed method and the design of the algorithm used in it. The empirical study and corresponding results analysis are presented in Section 4. Finally, Section 5 summarizes this study and provides concluding remarks. The nomenclature is provided in Table 1.

2. Methodology and Materials

2.1. Forecasting and Covariate Selection in the Interactive Model. Several nonlinear models including SVMs and ANNs have been considered to forecast global solar radiation because researchers have gradually realized that the outcomes given by a linear regression model are not accurate. Interactive models are one class of nonlinear models and have wide applications in many scientific fields. They are favored because they take the connections between covariates into account to capture the nonlinear relationship between the global solar radiation and the other meteorological and geographical covariates. In this study, an interactive model is considered to build the forecasting model in a global solar radiation problem and study the following nonlinear additive setup:

$$ y = \beta_0 + \sum_{j} \beta_j x_j + \sum_{j,k} \theta_{jk}\, x_j \odot x_k + \varepsilon, \tag{1} $$

where $y$ denotes the global solar radiation to be studied, $X = [x_1, \dots, x_p] \in \mathbb{R}^{n \times p}$ is the data matrix with $n$ samples and $p$ meteorological and geographical covariates, and $\varepsilon$ represents the noise term, which follows a Gaussian distribution $N(0, \sigma^2 I)$ with $\sigma^2$ representing the noise level and $I$ an identity matrix. The distribution of $\varepsilon$ can also be a binomial distribution if a classification problem is studied. We use $x_j \odot x_k$ to denote the two-way covariates. Here $\beta_0$ is the intercept term, and $\{\beta_j\}_{j=1}^{p}$ and $\{\theta_{jk}\}_{1 \le j,k \le p}$ are the coefficients of covariates to be estimated. Note that several covariates are included in this model. Furthermore, the large computation time is one challenge caused by the excessive number of covariates.

In the global solar radiation forecasting domain, a proper metric that measures the fitness of global solar radiation and other meteorological and geographical features needs to be defined. This metric is called a loss function, denoted by $L$, in the machine learning community. Because $\varepsilon$ follows a Gaussian distribution in the interactive model (1), a simple way to define this metric is to consider the negative log-likelihood of the distribution of $y$ (after centering), $f(y \mid X, \beta, \theta)$; that is, up to an additive constant,

$$ L := -\log f(y \mid X, \beta, \theta) = \frac{1}{2\sigma^2} \Bigl\| y - \sum_{j} \beta_j x_j - \sum_{j,k} \theta_{jk}\, x_j \odot x_k \Bigr\|_2^2, \tag{2} $$
where $\|\cdot\|_2$ represents the Euclidean norm. A common procedure is to restrict the number of covariates to avoid the overfitting problem. Namely, a few of the covariates may not be useful to the model and they are considered to be nuisance covariates. Therefore, how to extract the important covariates that make great contributions to the model is worth studying. This problem is well known as covariate selection. Covariate selection is essential because the dimensions are high in the interactive model. LASSO is a famous covariate selection method that minimizes the squared error loss plus an $\ell_1$-type penalty to enforce sparsity in the model. It has been shown to be accurate and can be seen as a Bayesian covariate selection using a Laplace prior distribution [43, 44]. However, LASSO often selects nuisance variables when the model coherence is high. Namely, if two covariates play a similar role in global solar radiation forecasting, it is not easy to distinguish between them. To overcome the disadvantages of LASSO, a nonconvex penalty SCAD is proposed and is shown to have better statistical properties (unbiasedness, continuity, and sparsity) than LASSO [42]. SCAD considers the following optimization:

$$ \min_{\beta_j, \theta_{jk}} \Bigl\{ L + \sum_{j} P_{\mathrm{SCAD}}(\beta_j; \lambda) + \sum_{j,k} P_{\mathrm{SCAD}}(\theta_{jk}; \lambda) \Bigr\} \tag{3} $$

with the SCAD penalty expressed by

$$ P_{\mathrm{SCAD}}(t; \lambda) = \int_{0}^{|t|} \Bigl\{ \lambda 1_{\{z \le \lambda\}} + \frac{(a\lambda - z)_+}{a - 1} 1_{\{z > \lambda\}} \Bigr\}\, dz \tag{4} $$

for $a = 3.7$, which is determined using generalized cross-validation.

To further increase the forecasting accuracy and facilitate the parameter tuning work, square root regularization approaches are used in this paper. Square root regularization has attracted the interest of researchers in the statistics and machine learning community for years because it makes the parameter tuning work easier. Specifically, in LASSO or SCAD problems, the tuning parameter $\lambda$ depends on the value of the noise level $\sigma$, i.e., $\lambda \sim O(\sigma \sqrt{2 \log p})$. However, the noise level $\sigma$ is difficult to estimate. To avoid this problem, Belloni et al. (2009) proposed square root LASSO (SRL), which applies a square root loss function $\sqrt{L}$ and considers the following optimization problem:

$$ \min_{\beta_j, \theta_{jk}} \Bigl\{ \sqrt{L} + \lambda \Bigl( \sum_{j} |\beta_j| + \sum_{j,k} |\theta_{jk}| \Bigr) \Bigr\}. \tag{5} $$

Unlike LASSO, the regularization parameter $\lambda$ in SRL depends on $\sqrt{2 \log p}$, which is independent of the noise level $\sigma$. As is known, parameter tuning is highly significant in determining the forecasting accuracy. Therefore, square root regularization may obtain accurate outcomes. To this end, this paper studies a novel covariate selection method that borrows strengths from both the square root loss function $\sqrt{L}$ and the SCAD penalty, which is given as follows:

$$ \min_{\beta_j, \theta_{jk}} \Bigl\{ \sqrt{L} + \sum_{j} P_{\mathrm{SCAD}}(\beta_j; \lambda) + \sum_{j,k} P_{\mathrm{SCAD}}(\theta_{jk}; \lambda) \Bigr\}. \tag{6} $$

In the following, we focus on (6).

Table 1: Nomenclature.

Abbreviations
AGM              Accelerated gradient method
AIC              Akaike information criterion
BPNN             Back propagation neural network
CC (seconds)     Computation cost
CS               Cuckoo search
CV               Cross-validation
ENN              Elman neural network
FA               Firefly algorithm
GCV              Generalized cross-validation
LASSO            Least absolute shrinkage and selection operator
LOOCV            Leave-one-out cross-validation
LSSVM            Least-squares support vector machine
MAPE             Mean absolute percentage error
PSO              Particle swarm optimization
QRBF             Quadratic radial basis function
RS               Random subspace
RMSE (W/m²)      Root-mean-square error
SCAD             Smoothly clipped absolute deviation
SRPQVSP          Square root progressive quantile variable screening procedure
SVM              Support vector machine
SRL              Square root LASSO
SRSCAD           Square root smoothly clipped absolute deviation
TIC              Theil inequality coefficient

Notation
$X$              Design matrix
$y$              Response variable
$\beta_0$        Intercept term
$\beta_j$        Coefficient for main effects
$\theta_{jk}$    Coefficients for covariates
$\varepsilon$    Error term
$S$              Threshold rule
$\eta$           Grid values
$G$              Cumulative distribution function of Gaussian distribution
$N$              Population size
$\gamma$         Light absorption coefficient
$\lambda$        Regularization parameter
$\tau$           Step size
$M$              Maximum number of iterations
$K$              Number of folds in CV
$\ell$           Location latitude
$D_n$ (days)     Number of days in the year
$S_d$ (hours)    Possible maximum monthly mean of daily sunshine duration
$N_h$            Number of hidden neurons
$B$              Number of blocks generated by RS
$b$              Number of features in each block
$g_0$            Attractiveness
$r$              Cartesian distance
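The objective in (6) can be made concrete with a small sketch. The code below is illustrative only and is not the authors' implementation: it assumes the usual positive part $(a\lambda - z)_+$ in the SCAD integrand of (4), which gives a three-piece closed form, it assumes centered data (no intercept, as in (2)), and it drops the constant $1/(2\sigma^2)$ so that the square root loss is simply the Euclidean norm of the residual. The function names `scad_penalty`, `interaction_columns`, and `srscad_objective` are hypothetical.

```python
import math


def scad_penalty(t, lam, a=3.7):
    """Closed form of the SCAD penalty integral for a scalar coefficient t."""
    t = abs(t)
    if t <= lam:
        return lam * t                      # linear (LASSO-like) part
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))  # quadratic part
    return lam * lam * (a + 1) / 2          # constant part: large coefficients are not penalized further


def interaction_columns(rows):
    """Append all p*p two-way products x_j * x_k to each row of p covariates."""
    return [list(r) + [r[j] * r[k] for j in range(len(r)) for k in range(len(r))]
            for r in rows]


def srscad_objective(rows, y, coef, lam, a=3.7):
    """Square root loss plus SCAD penalty for centered data.

    coef holds the p main-effect coefficients followed by the p*p interaction
    coefficients, in the column order produced by interaction_columns.
    """
    design = interaction_columns(rows)
    resid = [yi - sum(c * v for c, v in zip(coef, d)) for yi, d in zip(y, design)]
    sqrt_loss = math.sqrt(sum(e * e for e in resid))   # sigma-free square root loss
    return sqrt_loss + sum(scad_penalty(c, lam, a) for c in coef)
```

For a single observation with two covariates, `interaction_columns([[2.0, 3.0]])` yields six columns (two main effects and four products), which matches the growth from $p$ to $p^2 + p$ covariates in the interactive model.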
2.2. Ensemble Learning. Famous statistician George E.P. Box said “all the models are wrong but some of them are useful” [45]. This indicates that a single forecasting model cannot always perform very well under all circumstances. The basic idea of ensemble learning is to create a finite number of individual learners and combine them by assigning different weights and summing them up [46]. Ensemble learning is a supervised learning algorithm because it can be used to make predictions and classifications. A single model that is going to be trained can be seen as one hypothesis. However, it is difficult to decide which hypothesis is the best. Thus, the ensemble learning system combines all the hypotheses and creates a better hypothesis, although the computational time is increased. Therefore, ensemble learning can represent more flexible functions than a single model. Note that an overfitting problem can be caused by the ensemble techniques. In practice, regularization methods are applied to reduce the problem related to overfitting of the training data.

Empirically, ensemble learning tends to provide better results than a single model when there is significant diversity among the models. Therefore, many ensemble methods seek to promote diversity. Specifically, the diversity can be generated from the sample space (boosting and bagging), covariate space (RS [47] and random forest [48]), and parameter space. To gain better performance, the method used to combine single models is very important. The simplest way is just to calculate the average of all the results given by the single models. However, these single models may play different roles in the ensemble system. Thus, linear and nonlinear weighting strategies are applied to evaluate their contributions. In this paper, the FA, which is introduced in the following section, is used as a weighting strategy for single models.

2.3. Firefly Algorithm. Xin-She Yang proposed a nature-inspired metaheuristic in 2008 called FA, which was derived from the flashing behavior of fireflies [49]. The main purpose of the FA is to use a signal flash to attract other fireflies. There are three hypotheses in this algorithm: (i) regardless of sex, each firefly can be attracted to any other firefly; (ii) attractiveness is associated with brightness: fireflies with low light intensities are attracted by the brighter ones in the vicinity, and the brightness increases as their distance decreases; (iii) if no fireflies are brighter than the given one, it will move randomly [50, 51].

Let $F_i$ be the $i$th firefly in the population, where each firefly represents a candidate solution in the search space and $N$ is the population size. Fireflies move towards the optimal solution positions. The attractiveness is related to the light intensity; thus we can define the attractiveness between two fireflies $F_i$ and $F_j$ as follows:

$$ g(r_{ij}) = g_0 e^{-\gamma r_{ij}^2}, \tag{7} $$

and

$$ r_{ij} = \| F_i - F_j \|_2 = \sqrt{\sum_{d=1}^{D} (f_{id} - f_{jd})^2}, \tag{8} $$

for $d = 1, \dots, D$, where $\|\cdot\|_2$ denotes the Euclidean norm, $D$ is the number of dimensions, $r_{ij}$ is the Cartesian distance between $F_i$ and $F_j$, and $f_{id}$ and $f_{jd}$ are the $d$th components of $F_i$ and $F_j$, respectively. The parameter $g_0$ represents the attractiveness at the distance $r = 0$ and $\gamma$ denotes the light absorption coefficient. According to [52], a typical initial value is $\gamma = 1/\Gamma^2$, where $\Gamma$ is the length scale of the optimization problem.

The movement of the firefly $F_i$ towards another brighter firefly $F_j$ can be given by

$$ f_{id}(t+1) = f_{id}(t) + g_0 e^{-\gamma r_{ij}^2} \bigl[ f_{jd}(t) - f_{id}(t) \bigr] + \alpha\, \mathrm{sgn}\bigl(rand - \tfrac{1}{2}\bigr) \oplus \text{Lévy}, \tag{9} $$

$$ \text{Lévy}: \; u = t^{-a} \quad (1 < a \le 3), \tag{10} $$

where $\alpha \in [0, 1]$ is the step size of the random walk and $\mathrm{sgn}(rand - 1/2)$ ($rand \in [0, 1]$) is a random sign, whereas the random step length is obtained from a Lévy distribution in (10). In this paper, minimization problems are considered. Let $V$ be the fitness function. If $V(F_i) < V(F_j)$, this means that $F_i$ is brighter than the firefly $F_j$. The main steps of the FA are described as follows:

(i) Step 1. Initialize the positions of fireflies (or $N$ solutions) $F_i = [f_{i1}, f_{i2}, \dots, f_{iD}]$ randomly based on the following:

$$ f_{id} = low + rand(0, 1)\,(up - low) \quad (i = 1, 2, \dots, N,\; d = 1, 2, \dots, D), \tag{11} $$

where $low$ and $up$ denote the lower and upper bounds for dimension $d$, respectively. The fitness function can be used to evaluate the position of these fireflies in the initial population.
(ii) Step 2. Compare the positions of $F_i$ and $F_j$ ($i \ne j$); if $V(F_j) < V(F_i)$, that is, if $F_j$ is brighter, $F_i$ will be attracted by $F_j$ and updates its position based on (9). Then the updated position of each firefly can also be evaluated.
(iii) Step 3. If the iteration process satisfies the stopping criteria, the algorithm stops; otherwise, continue with step 2.

3. The RS-SRSCAD-FA Method

In this paper, a novel global solar radiation forecasting method is proposed based on RS, SRSCAD, and the FA and is called RS-SRSCAD-FA for short. The main procedure of RS-SRSCAD-FA is described as follows:

(i) Step 1: Build the interactive model as described in (1).
(ii) Step 2: The RS method is applied to the interactive model. In particular, randomly select $b$ covariates without replacement $B$ times, which results in $B$ blocks denoted by $\{G_1, \dots, G_B\}$ with $b$ covariates in each block.
(iii) Step 3: The SRSCAD covariate selection method is used in each block $\{G_i\}_{i=1}^{B}$.
(iv) Step 4: Assign weights to the $B$ blocks and concatenate the estimates using the weights determined by the FA.

Remark. In step 1, an interactive model is established so that all the two-way covariates $x_j \odot x_k$ that describe the possible relationships between covariates $x_j$ and $x_k$ are considered. Note that the dimension of the model has increased from $p$ to $p^2 + p$. In step 2, similar to random forest, the interactive model is divided into a number of blocks using the RS method, which randomly extracts a number of covariates. This procedure is the same as the divide procedure in the “divide and conquer” strategy of ensemble learning. The complex model is divided into some simple models that are easy to handle separately. After this procedure, all the solutions are combined. In step 3, the proposed dimension reduction method SRSCAD is used to reduce the number of covariates. This step is essential because it can reduce the number of covariates effectively. This delivers two benefits: (1) the computation issue is solved because only a few covariates are included; (2) the collinearity between covariates is decreased. In step 4, the SRSCAD estimates obtained in step 3 are concatenated using the FA instead of linear combination. This procedure aggregates the diversity generated by steps 2 and 3. The flowchart of the proposed method is shown in Figure 1.

Figure 1: Flowchart of the RS-SRSCAD-FA model.

To solve the penalized optimization problems efficiently, the thresholding rule (generally referred to as the $S$ function [53]), rather than the penalty function, is applied as the main tool throughout this study. The threshold function is defined rigorously as follows.

Definition (threshold rule). A threshold rule is a real-valued function $S(t; \lambda)$ defined for $-\infty < t < \infty$ and $0 \le \lambda < \infty$ such that

(1) $S(-t; \lambda) = -S(t; \lambda)$;
(2) $S(t; \lambda) \le S(t'; \lambda)$ for $t \le t'$;
(3) $\lim_{t \to +\infty} S(t; \lambda) = \infty$; and
(4) $0 \le S(t; \lambda) \le t$ for $0 \le t < \infty$.
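As a concrete instance of the definition, the soft-thresholding rule $\mathrm{sgn}(t)(|t| - \lambda)_+$ is a standard example of a threshold rule. The sketch below checks conditions (1), (2), and (4) numerically on a finite grid; condition (3) concerns the limit $t \to +\infty$ and is deliberately not checked, since a finite grid cannot verify it. The helper names and the grid are illustrative assumptions.

```python
def soft_threshold(t, lam):
    """Soft-thresholding rule: sgn(t) * max(|t| - lam, 0)."""
    shrunk = max(abs(t) - lam, 0.0)
    return shrunk if t >= 0 else -shrunk


def satisfies_threshold_rule(rule, lam, grid, eps=1e-12):
    """Check oddness, monotonicity, and shrinkage of `rule` on a sorted grid of t values."""
    odd = all(abs(rule(-t, lam) + rule(t, lam)) < eps for t in grid)
    values = [rule(t, lam) for t in grid]
    monotone = all(u <= v + eps for u, v in zip(values, values[1:]))
    shrink = all(-eps <= rule(t, lam) <= t + eps for t in grid if t >= 0)
    return odd and monotone and shrink


# Symmetric, sorted grid on [-5, 5] with spacing 0.1.
grid = [i / 10 for i in range(-50, 51)]
print(satisfies_threshold_rule(soft_threshold, 1.0, grid))
```

A rule that expands rather than shrinks, such as `lambda t, lam: 2 * t`, fails the shrinkage condition (4) and is rejected by the same check.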
Inputs: $Z = (X, y) = \{(\tilde{x}_1, y_1), \dots, (\tilde{x}_N, y_N)\}$, $\tilde{x}_k \in \mathbb{R}^p$: the dataset
        $B$: the number of blocks generated by RS
        $m$: maximum number of iterations in the SRSCAD algorithm
        $tol$: the error tolerance used in the SRSCAD algorithm
        $K$: number of folds in cross-validation
Output: The ensemble test error
 1: Randomly divide the original data $Z$ into the training dataset $T$ and test dataset $F$.
 2: Establish interactive models $Q_T$ and $Q_F$, respectively
 3: Randomly divide $Q_T$ into $B$ blocks $\{D_i\}_{i=1}^{B}$ with $b$ features in each block
 4: Initialize FA parameters including the maximum number of iterations $Iter_{max}$
 5: for $i = 1$ to $B$ do
 6:   Divide the training data $D_i$ into $K$ folds
 7:   for $j = 1$ to $K$ do
 8:     Use the $j$th fold as subset data $D_{i,j}^{train} = (X_{i,j}^{train}, y_{i,j}^{train})$ and the remaining folds are regarded as subset data $D_{i,j}^{test}$
 9:     Generate grid values of $\lambda_G = \{\lambda_u\}_{u=1}^{s}$
10:     for $u = 1$ to $s$ do
11:       Initialization: $t \leftarrow 0$, $\Omega^{(t)} \leftarrow 0$
12:       Scaling: $X_{i,j}^{train} \leftarrow X_{i,j}^{train}/\tau$, $y_{i,j}^{train} \leftarrow y_{i,j}^{train}/\tau$
13:       while $\|\Omega^{(t+1)} - \Omega^{(t)}\| \ge tol$ and $t \le m$ do
14:         Step 1. $\xi^{(t)} \leftarrow \Omega^{(t)} + ((\omega^{(t-1)} - 1)/\omega^{(t)})(\Omega^{(t)} - \Omega^{(t-1)})$
15:         Step 2. $\zeta^{(t)} \leftarrow \xi^{(t)} + X_{i,j}^{train\,\top}(y_{i,j}^{train} - X_{i,j}^{train}\xi^{(t)})$
16:         Step 3. $\Omega^{(t+1)} \leftarrow S_{\mathrm{SCAD}}(\zeta^{(t)}; \lambda_u \|y - X\Omega^{(t)}\|_2/\tau^2)$
17:         Step 4. $\omega^{(t+1)} = (1 + \sqrt{1 + 4(\omega^{(t)})^2})/2$; $t \leftarrow t + 1$
18:       end while
19:     end for
20:     Obtain the solution path $\mathcal{B} = \{b_u\}_{u=1}^{s}$ and the corresponding sparsity patterns $\mathcal{G} = \{g_u\}_{u=1}^{s}$.
21:   end for
22:   Calculate CV errors using $\mathcal{B}$ and $\mathcal{G}$. Determine the optimal tuning parameter $\lambda_{opt}$ with respect to the CV error and obtain the optimal SRSCAD estimator $\hat{\Omega}^{\mathrm{SRSCAD}-D_i}$
23: end for
24: Concatenate the optimal SRSCAD estimates $\{\hat{\Omega}^{\mathrm{SRSCAD}-D_i}\}_{i=1}^{B}$ using $w$ and calculate the fitness using the fitness function
25: Obtain the optimal weight vector by the FA algorithm, and concatenate the optimal SRSCAD estimates using $w_{opt}$
26: Calculate the forecasting error using the test dataset $Q_F$
27: Output the ensemble test error
Algorithm 1: The RS-SRSCAD-FA algorithm.
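The inner loop of Algorithm 1 (lines 11–19) can be sketched for a single value of $\lambda$ as follows. This is a minimal illustration under stated assumptions, not the authors' code: the thresholding step uses the closed-form SCAD thresholding rule of Fan and Li [42], the step size $\tau$ is taken to be the Frobenius norm of $X$ (a hypothetical choice that upper-bounds the spectral norm), the threshold is read as $\lambda \|y - X\Omega^{(t)}\|_2 / \tau^2$, and the coefficients are kept in one flat vector. All function names are illustrative.

```python
import math


def scad_threshold(t, lam, a=3.7):
    """Closed-form SCAD thresholding rule (Fan and Li form; an assumption here)."""
    s = 1.0 if t >= 0 else -1.0
    if abs(t) <= 2 * lam:
        return s * max(abs(t) - lam, 0.0)             # soft-thresholding zone
    if abs(t) <= a * lam:
        return ((a - 1) * t - s * a * lam) / (a - 2)  # transition zone
    return t                                          # large inputs pass through unchanged


def matvec(A, v):
    """Compute A v for a list-of-rows matrix A."""
    return [sum(x * w for x, w in zip(row, v)) for row in A]


def rmatvec(A, v):
    """Compute A^T v for a list-of-rows matrix A."""
    return [sum(row[j] * vi for row, vi in zip(A, v)) for j in range(len(A[0]))]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def srscad_agm(X, y, lam, tol=1e-6, max_iter=1000, a=3.7):
    """Accelerated thresholding iterations for the square root loss with a SCAD rule."""
    tau = math.sqrt(sum(x * x for row in X for x in row))  # assumed step size (Frobenius norm)
    Xs = [[x / tau for x in row] for row in X]             # scaling step (line 12)
    ys = [v / tau for v in y]
    p = len(X[0])
    omega_prev, omega = [0.0] * p, [0.0] * p
    w_prev, w = 0.0, 1.0                                   # omega^(-1) = 0, omega^(0) = 1
    for _ in range(max_iter):
        mom = (w_prev - 1.0) / w
        xi = [o + mom * (o - op) for o, op in zip(omega, omega_prev)]        # Step 1
        res_s = [u - v for u, v in zip(ys, matvec(Xs, xi))]
        zeta = [x + g for x, g in zip(xi, rmatvec(Xs, res_s))]               # Step 2
        resid = [u - v for u, v in zip(y, matvec(X, omega))]
        thresh = lam * norm(resid) / tau ** 2
        new = [scad_threshold(z, thresh, a) for z in zeta]                   # Step 3
        w_prev, w = w, (1.0 + math.sqrt(1.0 + 4.0 * w * w)) / 2.0            # Step 4
        omega_prev, omega = omega, new
        if norm([u - v for u, v in zip(omega, omega_prev)]) < tol:
            break
    return omega
```

On a small noise-free design such as `X = [[1, 0], [0, 1], [1, 1], [1, -1]]` with `y = [2, 0, 2, 2]`, setting `lam=0.0` reduces the loop to accelerated gradient descent on least squares, while a very large `lam` thresholds every coefficient to zero in the first iteration.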
From the definition, it is evident that $S(\cdot; \lambda)$ is an odd monotone unbounded shrinkage rule for $t$, at any $\lambda$. The LASSO and SCAD rules are defined as follows:

$$S_{SOFT}(t; \lambda) = \mathrm{sgn}(t)\,(|t| - \lambda)_+ ; \tag{12}$$

$$S_{SCAD}(t; \lambda) = \begin{cases} \lambda & \text{if } t \le \lambda \\ \dfrac{3.7\lambda - t}{2.7} & \text{if } \lambda \le t \le a\lambda \\ 0 & \text{if } t \ge a\lambda \end{cases} ; \tag{13}$$

and sgn is the sign function. The vector versions of $S_{SOFT}(t; \lambda)$ and $S_{SCAD}(t; \lambda)$ are denoted by $S_{SOFT}(t'; \lambda)$ and $S_{SCAD}(t'; \lambda)$ for any vector $t'$.
To describe the proposed RS-SRSCAD-FA algorithm in a simple way, we first define a matrix $\Omega$ with its $j$th column given by $\Omega_j = [\beta_j, \theta_{j1}, \ldots, \theta_{jp}]$, which indicates the $j$th covariate and all of its associated two-way covariates. The details of the proposed algorithm are given in Algorithm 1 after initialization with $\omega^{(-1)} = 0$, $\omega^{(0)} = 1$ and $\Omega^{(0)} = \Omega^{(-1)}$. When designing the algorithm, the following points should be noted:
(i) Data splitting scheme. A total of 75% of the original data is used as training data and the remaining part (25%) is used as test data. The training data is applied to train a forecasting model and the test data is used to evaluate the model performance.
(ii) Convergence.
Step size. The step size $\tau$ is used to ensure the convergence of the SRSCAD algorithm (cf. lines 11–19 in Algorithm 1). Usually, it can be determined based on a theoretical analysis with the surrogate function defined. Another way to
determine the step size is to apply a line search method.
Stopping criteria. The error tolerance between two successive iterates and the maximum iteration number are determined by trial and error. In particular, the error tolerance is chosen from the set {1e−4, 1e−5, 1e−6} and the maximum number of iterations is selected from {100, 500, 1000}. The optimal combination of the error tolerance and maximum iteration number is determined using the trade-off between forecasting accuracy and computational efficiency.
(iii) Acceleration. The accelerated gradient method (AGM) is applied to increase the convergence speed. The design of AGM was first proposed by [54], and the convergence rate can be improved from $O(1/M)$ to $O(1/M^2)$, where $M$ represents the number of iterations of the algorithm.
(iv) Parameter tuning. In a regularization-based problem, the choice of regularization parameter is critical. In this work, the regularization parameters are selected based on 10-fold CV. The main procedure can be described as follows: the data is divided into $K$ parts of roughly equal size. $K − 1$ parts are used to train the model and the remaining part is used to calculate the test error. CV repeats this procedure $K$ times for each candidate regularization parameter, and the final CV error is the average of the $K$ test errors. The optimal parameter is the one with the smallest CV error. When $K = n$, this procedure is known as leave-one-out cross-validation (LOOCV), which is often used in complex problems.

4. Forecasting Global Solar Radiation in Xinjiang Province of China

In this section, to demonstrate the forecasting accuracy of the proposed RS-SRSCAD-FA algorithm, comparisons are made between this and other forecasting methods including two benchmark models, the Elman neural network (ENN) and LSSVM, the Angstrom–Prescott empirical model (Angstrom–Prescott) [55], popular feature selection models including LASSO, SCAD, and SRL, and existing hybrid models including the combination of cuckoo search and square root progressive quantile variable selection procedure in sparse quadratic radial basis function (CS-SRPQVSP-QRBF) [56], particle swarm optimization in a backpropagation neural network (PSO-BPNN) [57], and the firefly algorithm in a support vector machine (SVM-FA) [58]. The experimental results are evaluated using forecasting accuracy and computation efficiency. Furthermore, the analysis of multihour-ahead cases including 24 h ahead and 48 h ahead is also provided in the experiments.

4.1. Data Collection and Description. The Xinjiang area is rich in sunshine and far from the population centers so that it is appropriate for large-scale solar thermal power industry installations. Therefore, it is worthwhile to investigate the global solar radiation by conducting research in eight sites in this region; see Figure 2. This dataset is collected from the National Renewable Energy Laboratory (NREL) and is available at the following website http://www.nrel.gov/gis/solar.html. In addition to the global solar radiation (W/m²), this data consists of seven meteorological covariates, which are the solar zenith angle (degrees), precipitation (cm), temperature (°C), wind direction (degrees), wind speed (m/s), relative humidity (%), and air pressure (mbar). It was collected between 11:00 and 20:00 from 1/1/2014 to 12/31/2014 as the sunshine is poor during other periods. As an illustrative example, the global solar radiation signal and other meteorological covariates from 1/1/2014 to 1/31/2014 in Site 1 are shown in Figure 3. The main purpose of this paper is to forecast the hourly global solar radiation using these meteorological covariates accurately and efficiently, which could play a dominant role in the design of solar power plants.

4.2. Individual Models. ENN and LSSVM are implemented using the MATLAB neural network toolbox and LSSVMlab 1.5 toolbox. Specifically, a three-layer ENN is established with the number of input neurons $N_i = 56$, which is the number of covariates in the interactive model. The optimal number of hidden neurons is 5, which is selected from the set $N_h = \{5, 10, 15, 20\}$ using 10-fold CV. The number of output neurons is $N_o = 1$. Thus, ENN uses a 56 × 5 × 1 network architecture. Furthermore, to estimate the weights between layers more accurately, weight regularization with an $\ell_2$ penalty function is applied during the weight estimation procedure. The $\ell_2$ regularization parameter is selected from a grid of values denoted by $\eta = \{1e−5, 1e−4, 1e−3, 1e−2, 1e−1\}$. The optimal regularization parameter $\eta_o$ is determined as 1e−5 using the Akaike information criterion (AIC) [59]. LSSVM is established by applying an RBF with two kernel parameters selected from the grid $\{2^{-6}, 2^{-5}, \ldots, 2^{5}, 2^{6}\}$ using 10-fold CV. SRL is implemented based on the algorithm COORD, the MATLAB code of which may be downloaded from Belloni’s website. Theoretical selection is applied for the regularization parameter $\lambda$ in COORD: $\lambda = 1.1\,G^{-1}(1 − 0.05/(2(p^2 + p)))$ in the optimization problem (5), as recommended in [60], with $G$ representing the cumulative distribution function of a Gaussian distribution. For the Angstrom–Prescott empirical model, we apply the empirical model in the form of a second-order polynomial, which has been proved to be superior to other forms of the model by [55]. The possible maximum monthly mean of the daily sunshine duration is given by $S_d = (2/15)\cos^{-1}(−\tan \ell \tan \delta)$ with $\ell$ representing the location latitude and $\delta = 23.45 \sin(360(284 + D_n)/365)$, where $D_n$ is the day number in the year. CS-SRPQVSP-QRBF, PSO-BPNN, and SVM-FA are implemented using MATLAB and all the model parameters are tuned to give good performances. The proposed RS-SRSCAD-FA is implemented based on Algorithm 1.

4.3. Statistical Measures of Forecasting Performance. In this paper, four common criteria, the mean absolute percentage error (MAPE), root-mean-square error (RMSE), Theil
(Figure 2 map legend: annual total global horizontal radiation in kWh/m², with classes <1050, 1050–1400, 1400–1750, and >1750; all sites are located in Xinjiang.)
Site    Latitude  Longitude  Mean (W/m²)  Maximum (W/m²)  Minimum (W/m²)
Site 1  36.85°N   80.55°E    560.213      956             17
Site 2  37.05°N   81.65°E    556.648      960             9
Site 3  36.95°N   81.75°E    557.266      959             9
Site 4  37.35°N   82.95°E    556.207      963             11
Site 5  37.55°N   83.15°E    555.014      964             12
Site 6  37.35°N   83.75°E    558.547      960             16
Site 7  37.55°N   84.15°E    559.223      969             18
Site 8  36.85°N   88.75°E    666.151      1101            25
Figure 2: Details of the eight sites in the Xinjiang area for measuring global solar radiation.
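The day-length formula used for the Angstrom–Prescott model in Section 4.2 can be evaluated directly from a site latitude such as those listed for Figure 2. The following is a minimal sketch, assuming that both the declination formula and the inverse cosine are expressed in degrees (the usual convention for these expressions); the function names are hypothetical.

```python
import math


def solar_declination_deg(day_of_year):
    """Declination delta = 23.45 * sin(360 * (284 + D_n) / 365), in degrees."""
    return 23.45 * math.sin(math.radians(360.0 * (284 + day_of_year) / 365.0))


def max_sunshine_hours(latitude_deg, day_of_year):
    """Maximum daily sunshine duration S_d = (2/15) * acos(-tan(l) * tan(delta)).

    The arccosine is converted to degrees before applying the 2/15 factor
    (assumption: angles in degrees, result in hours).
    """
    lat = math.radians(latitude_deg)
    delta = math.radians(solar_declination_deg(day_of_year))
    sunset_angle_deg = math.degrees(math.acos(-math.tan(lat) * math.tan(delta)))
    return (2.0 / 15.0) * sunset_angle_deg


# Site 1 latitude (36.85 degrees north) around the June solstice (day 172).
print(round(max_sunshine_hours(36.85, 172), 2))
```

At the equator the expression reduces to 12 hours for every day of the year, which is a convenient sanity check on the unit convention.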
inequality coefficient (TIC), and correlation coefficient ($R$), are used to evaluate the prediction accuracy. The definitions of the criteria are as follows:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|, \tag{14}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}, \tag{15}$$

$$\mathrm{TIC} = \frac{\sqrt{(1/n)\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}}{\sqrt{(1/n)\sum_{i=1}^{n}\hat{y}_i^2} + \sqrt{(1/n)\sum_{i=1}^{n}y_i^2}}, \tag{16}$$

$$R = \frac{\sum_{i=1}^{n}(y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2\,\sum_{i=1}^{n}(\hat{y}_i - \bar{\hat{y}})^2}}, \tag{17}$$

where $n$ is the sample size of the test data, $y_i$ and $\hat{y}_i$ represent the true value and estimated value, respectively, and $\bar{y}$ and $\bar{\hat{y}}$ denote the mean values of $y$ and $\hat{y}$. The data is divided into the training dataset (75%) and test dataset (25%). Specifically, the training data is used to establish the forecasting models, and the out-of-sample forecasting performances are evaluated based on the test data. The division procedure is repeated 30 times, and the median of the MAPEs, RMSEs, TICs, and $R$ values is reported. Furthermore, the computation cost (in seconds) is denoted by CC, and the experiments are implemented using a PC with an Intel Core i7 CPU at 3.60 GHz.

4.4. Results Analysis. Table 2 summarizes the forecasting results and computation cost of all the compared approaches at Sites 1–8. As the median is more robust to outliers than the mean, we focus on the median when making comparisons. The performance of the proposed RS-SRSCAD-FA is remarkable because it provides the lowest RMSE and TIC values, whereas its MAPEs are marginally higher than those of other forecasting methods. For instance, RMSEs of RS-SRSCAD-FA are 20.21, 21.93, 21.20, and 20.29 W/m² at Sites 1–4, which are approximately 20.37%, 13.22%, 18.99%, and 22.65% lower than those of ENN. The Angstrom–Prescott model performs better than both CS-SRPQVSP-QRBF and SVM-FA in terms of accuracy, except for Site 8. For instance, RMSEs of the Angstrom–Prescott model are given as 23.13, 24.13, 24.63, 25.14, 24.74, 22.32, and 22.22 W/m² at Sites 1–7, which are lower than those of CS-SRPQVSP-QRBF (25.78, 25.16, 25.62, 22.34, 27.41, 22.65, and 22.89 W/m²) at each of these sites except Site 4. CS-SRPQVSP-QRBF delivers better results than SVM-FA except for Sites 1 and 2. Both CS-SRPQVSP-QRBF and SVM-FA outperform PSO-BPNN in terms of MAPE, RMSE, and TIC. The reason for this is probably that CS and FA are superior to PSO in parameter estimation. However, PSO-BPNN still produces better outcomes than LSSVM,
Figure 3: Example of a global solar radiation signal and other meteorological covariates including solar zenith angle (degrees), precipitation (cm), temperature (°C), wind direction (degrees), wind speed (m/s), relative humidity (%), and air pressure (mbar).
LASSO, SCAD, and SRL, which demonstrates that the hybrid forecasting methods have improved the model performance to some extent.
LASSO, SCAD, and SRL provide comparable forecasting results and training times that are much shorter than those of the other methods. Although LSSVM is computationally rapid, its performance is not as good as that of ENN, and it provides the largest forecasting errors at Sites 2–7. The Angstrom–Prescott model is more computationally efficient than CS-SRPQVSP-QRBF, PSO-BPNN, and SVM-FA, which take more training time. For instance, at Site 4, the computational time of the Angstrom–Prescott model is 4.06 s, and those of CS-SRPQVSP-QRBF, PSO-BPNN, and SVM-FA are 415.35, 502.35, and 408.56 s, respectively. The reason why CS-SRPQVSP-QRBF takes more computation time is that it considers the two-way interaction terms and the selection
Table 2: Statistical performances of the proposed RS-SRSCAD-FA and other competitors. The statistical performances are evaluated by
mean absolute percentage error (MAPE), root-mean-square error (RMSE, W/m2 ), Theil inequality coefficient (TIC), computation cost (CC,
in seconds), and correlation coefficient (𝑅) at Sites 1–8.
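(In each block below, the first five value columns refer to the first site named and the last five to the second site.)
Method MAPE RMSE TIC CC R MAPE RMSE TIC CC R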
Sites 1 and 2
ENN 0.062 25.38 0.021 4.46 0.82 0.083 25.27 0.021 2.84 0.84
LSSVM 0.086 21.67 0.018 0.08 0.97 0.175 30.95 0.026 0.06 0.82
LASSO 0.057 28.84 0.024 1.25 0.76 0.078 28.06 0.023 0.80 0.79
SCAD 0.073 29.29 0.024 0.93 0.75 0.077 28.47 0.024 0.95 0.78
SRL 0.073 29.30 0.024 0.61 0.83 0.077 28.49 0.024 0.06 0.85
Angstrom–Prescott 0.078 23.13 0.019 2.40 0.94 0.087 24.13 0.020 4.08 0.93
CS-SRPQVSP-QRBF 0.073 25.78 0.018 585.03 0.81 0.095 25.16 0.021 425.13 0.91
PSO-BPNN 0.082 28.16 0.223 438.12 0.80 0.092 26.65 0.022 473.15 0.88
SVM-FA 0.084 24.13 0.020 312.56 0.88 0.091 26.34 0.022 387.13 0.88
RS-SRSCAD-FA 0.066 20.21 0.016 3.40 0.98 0.10 21.93 0.018 3.06 0.96
Sites 3 and 4
ENN 0.082 26.17 0.022 2.64 0.85 0.079 26.23 0.022 2.87 0.83
LSSVM 0.186 31.47 0.026 0.06 0.82 0.153 32.07 0.027 0.07 0.72
LASSO 0.080 29.01 0.024 0.78 0.86 0.078 29.44 0.025 0.84 0.76
SCAD 0.081 29.42 0.024 0.90 0.88 0.079 29.91 0.025 0.90 0.79
SRL 0.082 29.75 0.025 0.56 0.87 0.079 29.79 0.025 0.61 0.78
CS-SRPQVSP-QRBF 0.092 25.62 0.021 321.45 0.90 0.078 22.34 0.018 415.35 0.89
PSO-BPNN 0.100 27.65 0.023 501.18 0.87 0.110 28.35 0.023 502.23 0.79
SVM-FA 0.093 26.85 0.022 392.15 0.89 0.097 27.05 0.022 408.56 0.82
RS-SRSCAD-FA 0.091 21.20 0.017 3.29 0.92 0.078 20.29 0.017 2.96 0.94
Sites 5 and 6
ENN 0.077 25.83 0.021 4.03 0.89 0.063 25.67 0.021 2.68 0.85
LSSVM 0.148 31.76 0.027 0.06 0.79 0.071 28.52 0.024 0.78 0.82
LASSO 0.079 29.03 0.024 0.79 0.80 0.069 29.03 0.024 0.91 0.79
SCAD 0.078 29.30 0.024 0.87 0.80 0.069 29.03 0.024 0.91 0.79
SRL 0.076 29.35 0.024 0.56 0.82 0.072 28.92 0.024 0.60 0.83
CS-SRPQVSP-QRBF 0.085 27.41 0.022 450.56 0.89 0.067 22.65 0.018 403.12 0.89
PSO-BPNN 0.096 28.16 0.024 492.56 0.87 0.090 27.96 0.023 592.87 0.86
SVM-FA 0.094 26.97 0.022 413.95 0.88 0.089 27.35 0.023 459.78 0.86
RS-SRSCAD-FA 0.075 20.99 0.017 3.02 0.94 0.066 20.74 0.017 3.15 0.91
Sites 7 and 8
ENN 0.059 25.03 0.021 2.69 0.88 0.042 27.07 0.019 2.69 0.79
LSSVM 0.119 31.41 0.026 0.06 0.74 0.059 32.05 0.023 0.06 0.72
LASSO 0.069 27.79 0.023 0.80 0.71 0.060 33.96 0.024 0.79 0.75
SCAD 0.067 28.33 0.023 0.89 0.68 0.050 30.77 0.022 0.91 0.82
SRL 0.069 28.17 0.023 0.56 0.75 0.052 31.57 0.022 0.58 0.83
CS-SRPQVSP-QRBF 0.065 22.89 0.018 357.15 0.91 0.033 21.32 0.015 442.18 0.90
PSO-BPNN 0.083 26.16 0.022 398.92 0.85 0.082 27.96 0.020 402.15 0.86
SVM-FA 0.083 25.94 0.021 456.13 0.87 0.079 25.74 0.018 502.89 0.89
RS-SRSCAD-FA 0.059 19.46 0.016 3.01 0.95 0.033 21.15 0.015 4.03 0.93
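The accuracy criteria reported in Table 2 follow definitions (14)–(17). As a minimal sketch of how they can be computed for one test split, the functions below take plain Python lists of actual and forecast values; the function names are hypothetical, $R$ is computed as the usual Pearson correlation, and MAPE assumes that no actual value is zero.

```python
import math


def mape(y_true, y_pred):
    """Mean absolute percentage error, definition (14)."""
    return sum(abs((y - f) / y) for y, f in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root-mean-square error, definition (15)."""
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(y_true, y_pred)) / len(y_true))


def tic(y_true, y_pred):
    """Theil inequality coefficient, definition (16)."""
    n = len(y_true)
    denom = (math.sqrt(sum(f * f for f in y_pred) / n)
             + math.sqrt(sum(y * y for y in y_true) / n))
    return rmse(y_true, y_pred) / denom


def correlation(y_true, y_pred):
    """Correlation coefficient R between actual and forecast values, (17)."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    mean_f = sum(y_pred) / n
    cov = sum((y - mean_y) * (f - mean_f) for y, f in zip(y_true, y_pred))
    var_y = sum((y - mean_y) ** 2 for y in y_true)
    var_f = sum((f - mean_f) ** 2 for f in y_pred)
    return cov / math.sqrt(var_y * var_f)


# Toy values in W/m^2 (illustrative only, not taken from the dataset).
actual = [100.0, 200.0]
forecast = [110.0, 190.0]
print(mape(actual, forecast), rmse(actual, forecast))
print(tic(actual, forecast), correlation(actual, forecast))
```

In the experimental protocol described in Section 4.3, such values would be computed on each of the 30 random splits and summarized by their median.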
of parameters in the RBF also takes some time. The reason for the low training speed of PSO-BPNN and SVM-FA is that the models are retrained whenever the parameters change. Furthermore, both ENN and RS-SRSCAD-FA compute more slowly because their model structures are more complex than those of the other models. However, this is an acceptable trade-off between forecasting accuracy and computational efficiency because the accuracy is improved at the cost of more computation time. The correlation coefficients of the compared methods are given in the last column of the table. RS-SRSCAD-FA
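Table 3: Statistical performances of the proposed RS-SRSCAD-FA and other competitors. The statistical performances are evaluated by mean absolute percentage error (MAPE), root-mean-square error (RMSE, W/m²), Theil inequality coefficient (TIC), computation cost (CC, in seconds), and correlation coefficient (R) at Sites 1–8 for 24-hour ahead forecasting.
(In each block below, the first five value columns refer to the first site named and the last five to the second site.)
Method MAPE RMSE TIC CC R MAPE RMSE TIC CC R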
Sites 1 and 2
ENN 0.044 17.94 0.014 0.23 0.82 0.057 19.62 0.015 1.16 0.85
LSSVM 0.061 16.40 0.014 0.62 0.94 0.104 22.74 0.020 0.02 0.80
LASSO 0.047 20.05 0.019 0.50 0.75 0.055 21.70 0.017 0.42 0.81
SCAD 0.036 20.43 0.017 0.37 0.73 0.053 22.34 0.020 0.59 0.82
SRL 0.041 20.25 0.016 0.30 0.81 0.053 22.31 0.020 0.03 0.84
CS-SRPQVSP-QRBF 0.047 16.95 0.012 216.31 0.85 0.067 19.82 0.018 255.17 0.90
PSO-BPNN 0.053 19.75 0.015 200.01 0.84 0.065 19.98 0.018 211.24 0.87
SVM-FA 0.052 16.24 0.013 150.34 0.89 0.064 20.45 0.018 198.57 0.87
RS-SRSCAD-FA 0.044 13.11 0.010 1.47 0.96 0.069 16.90 0.014 1.90 0.95
Sites 3 and 4
ENN 0.057 18.92 0.014 1.39 0.88 0.060 17.87 0.013 1.38 0.82
LSSVM 0.129 21.34 0.017 0.02 0.79 0.118 21.87 0.016 0.04 0.78
LASSO 0.055 20.53 0.016 0.38 0.8 0.059 21.25 0.016 0.49 0.74
SCAD 0.056 20.83 0.016 0.55 0.89 0.060 20.82 0.015 0.56 0.8
SRL 0.057 20.78 0.017 0.34 0.87 0.059 21.11 0.016 0.42 0.77
CS-SRPQVSP-QRBF 0.064 18.05 0.014 177.89 0.89 0.060 15.73 0.012 233.37 0.90
PSO-BPNN 0.066 19.70 0.016 281.79 0.88 0.074 20.30 0.012 308.98 0.79
SVM-FA 0.064 18.64 0.014 202.40 0.92 0.073 19.55 0.014 182.97 0.85
RS-SRSCAD-FA 0.062 13.72 0.010 1.81 0.94 0.058 14.49 0.010 1.40 0.97
Sites 5 and 6
ENN 0.057 17.52 0.014 2.22 0.88 0.033 18.38 0.013 1.35 0.84
LSSVM 0.110 21.47 0.016 0.03 0.78 0.047 19.74 0.014 0.32 0.83
LASSO 0.056 20.29 0.015 0.53 0.79 0.039 20.54 0.014 0.56 0.77
SCAD 0.059 19.63 0.015 0.47 0.81 0.038 20.54 0.014 0.48 0.76
SRL 0.055 20.12 0.014 0.33 0.8 0.054 18.59 0.015 0.42 0.77
CS-SRPQVSP-QRBF 0.064 18.63 0.014 229.50 0.91 0.048 15.90 0.012 235.35 0.88
PSO-BPNN 0.070 18.97 0.015 289.92 0.89 0.065 19.94 0.015 350.75 0.86
SVM-FA 0.070 17.15 0.012 278.75 0.88 0.067 19.14 0.015 286.67 0.86
RS-SRSCAD-FA 0.056 13.16 0.011 1.84 0.92 0.048 14.71 0.011 1.90 0.89
Sites 7 and 8
ENN 0.038 17.58 0.012 1.84 0.84 0.031 19.31 0.012 1.28 0.78
LSSVM 0.705 22.66 0.014 0.02 0.73 0.038 21.59 0.014 0.02 0.72
LASSO 0.051 19.04 0.014 0.48 0.68 0.045 22.76 0.016 0.35 0.73
SCAD 0.051 19.37 0.014 0.52 0.69 0.037 21.13 0.014 0.59 0.81
SRL 0.051 19.23 0.014 0.35 0.74 0.040 21.12 0.013 0.34 0.81
CS-SRPQVSP-QRBF 0.051 15.46 0.011 217.58 0.89 0.024 14.79 0.009 272.25 0.89
PSO-BPNN 0.064 18.40 0.014 214.68 0.83 0.063 19.37 0.012 256.45 0.86
SVM-FA 0.065 17.58 0.014 282.36 0.85 0.060 17.38 0.010 342.09 0.88
RS-SRSCAD-FA 0.044 13.70 0.010 1.49 0.94 0.025 14.53 0.009 1.50 0.92
is still the best by giving the highest $R$ values. For instance, the $R$ values of RS-SRSCAD-FA are 0.98, 0.96, 0.92, 0.94, 0.94, 0.91, 0.95, and 0.93 for Sites 1–8, respectively. The Angstrom–Prescott model also provides comparable $R$ values, which are also higher than 0.90. LSSVM only produces $R$ values of 0.97 for Site 1 and around 0.80 for the other sites. The performances of the compared methods for 24-hour ahead and 48-hour ahead forecasting are listed in Tables 3 and 4, respectively. The results are consistent with what we have discussed above. It is observed
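Table 4: Statistical performances of the proposed RS-SRSCAD-FA and other competitors. The statistical performances are evaluated by mean absolute percentage error (MAPE), root-mean-square error (RMSE, W/m²), Theil inequality coefficient (TIC), computation cost (CC, in seconds), and correlation coefficient (R) at Sites 1–8 for 48-hour ahead forecasting.
(In each block below, the first five value columns refer to the first site named and the last five to the second site.)
Method MAPE RMSE TIC CC R MAPE RMSE TIC CC R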
Sites 1 and 2
ENN 0.059 24.91 0.019 0.45 0.81 0.078 24.53 0.018 1.79 0.82
LSSVM 0.081 22.78 0.02 1.21 0.92 0.142 28.43 0.024 0.03 0.78
LASSO 0.063 27.85 0.026 0.98 0.74 0.075 27.12 0.02 0.64 0.76
SCAD 0.048 28.37 0.023 0.72 0.72 0.073 27.92 0.023 0.91 0.81
SRL 0.054 28.12 0.022 0.58 0.80 0.073 27.89 0.023 0.04 0.82
CS-SRPQVSP-QRBF 0.063 23.54 0.017 424.13 0.84 0.092 24.78 0.021 392.57 0.89
PSO-BPNN 0.071 27.43 0.021 392.18 0.83 0.089 24.98 0.021 324.98 0.86
SVM-FA 0.069 22.56 0.018 294.78 0.87 0.087 25.56 0.021 305.49 0.86
RS-SRSCAD-FA 0.059 18.21 0.014 2.89 0.95 0.094 21.12 0.017 2.92 0.94
Sites 3 and 4
ENN 0.079 25.92 0.021 1.98 0.87 0.076 24.15 0.019 1.92 0.81
LSSVM 0.179 29.23 0.024 0.03 0.78 0.149 29.56 0.024 0.05 0.74
LASSO 0.077 28.12 0.023 0.54 0.76 0.075 28.72 0.023 0.68 0.72
SCAD 0.078 28.54 0.023 0.78 0.87 0.076 28.14 0.022 0.78 0.78
SRL 0.079 28.46 0.024 0.49 0.85 0.075 28.53 0.024 0.59 0.76
CS-SRPQVSP-QRBF 0.089 24.72 0.02 254.13 0.88 0.076 21.26 0.017 324.13 0.87
PSO-BPNN 0.092 26.98 0.023 402.56 0.87 0.094 27.43 0.018 429.14 0.78
SVM-FA 0.089 25.54 0.02 289.14 0.9 0.092 26.42 0.02 254.12 0.84
RS-SRSCAD-FA 0.086 18.79 0.014 2.59 0.91 0.074 19.58 0.015 1.94 0.96
Sites 5 and 6
ENN 0.072 23.68 0.02 3.09 0.87 0.042 24.84 0.019 1.87 0.82
LSSVM 0.139 29.01 0.024 0.04 0.76 0.059 26.67 0.021 0.45 0.80
LASSO 0.071 27.42 0.022 0.74 0.78 0.049 27.76 0.021 0.78 0.76
SCAD 0.075 26.53 0.022 0.65 0.79 0.048 27.76 0.021 0.67 0.74
SRL 0.069 27.19 0.021 0.46 0.78 0.068 25.12 0.022 0.59 0.76
CS-SRPQVSP-QRBF 0.081 25.18 0.02 318.75 0.89 0.061 21.49 0.017 326.87 0.87
PSO-BPNN 0.089 25.63 0.022 402.67 0.86 0.082 26.94 0.022 487.15 0.85
SVM-FA 0.088 23.18 0.018 387.15 0.85 0.085 25.87 0.022 398.15 0.85
RS-SRSCAD-FA 0.071 17.79 0.016 2.56 0.90 0.061 19.88 0.016 2.64 0.88
Sites 7 and 8
ENN 0.048 23.76 0.018 2.56 0.83 0.039 26.09 0.017 1.78 0.76
LSSVM 0.892 30.62 0.02 0.03 0.72 0.048 29.17 0.021 0.03 0.68
LASSO 0.065 25.73 0.021 0.67 0.68 0.057 30.76 0.023 0.48 0.71
SCAD 0.065 26.18 0.021 0.72 0.66 0.047 28.56 0.02 0.82 0.79
SRL 0.064 25.98 0.02 0.48 0.72 0.05 28.54 0.019 0.47 0.79
CS-SRPQVSP-QRBF 0.064 20.89 0.016 302.19 0.87 0.031 19.98 0.013 378.12 0.88
PSO-BPNN 0.081 24.87 0.021 298.17 0.82 0.08 26.18 0.018 356.18 0.84
SVM-FA 0.082 23.75 0.02 392.17 0.84 0.076 23.48 0.015 475.13 0.86
RS-SRSCAD-FA 0.056 18.51 0.014 2.07 0.93 0.032 19.63 0.013 2.09 0.91
that RS-SRSCAD-FA provides remarkable results followed by the Angstrom–Prescott model and CS-SRPQVSP-QRBF. The RMSEs of RS-SRSCAD-FA for 24 hours ahead are 13.11, 16.90, 13.72, 14.49, 13.16, 14.71, 13.70, and 14.53 W/m² for Sites 1–8, respectively. The correlation coefficients of RS-SRSCAD-FA are around 0.90, which are also higher than those of the other methods. The RMSEs of LASSO, SCAD, and SRL are approximately 20–21 W/m², which are comparable to but
Figure 4: Boxplots of MAPE, RMSE, and TIC at Sites 1–8. The compared models ENN, LSSVM, LASSO, SCAD, SRL, Angstrom-Prescott,
CS-SRPQVSP-QRBF, PSO-BPNN, SVM-FA, and RS-SRSCAD-FA are indicated by numbers 1–10, respectively (the 𝑦-axis is truncated to more
effectively display the remaining boxplots).
worse than those of CS-SRPQVSP-QRBF, PSO-BPNN, and SVM-FA (approximately 14–18 W/m²). Similar phenomena can also be seen in Table 4 for the case of 48-hour ahead forecasting.
The boxplots shown in Figure 4 clearly demonstrate that the forecasting errors of RS-SRSCAD-FA are markedly lower and only a few outliers are observed (the red lines represent the median values). Figures 5 and 6 exhibit the relationship
Figure 5: Plot of forecasting values and actual values of global solar radiation at Sites 1–4. (Scatter panels for ENN, LSSVM, LASSO, SCAD, SRL, Angstrom-Prescott, CS-SRPQVSP-QRBF, PSO-BPNN, SVM-FA, and RS-SRSCAD-FA; axes: actual solar radiation (W/m²) and forecast global solar radiation (W/m²).)
between the measured data and the estimated values in the solar radiation datasets. The points of RS-SRSCAD-FA are closer to the straight line, which demonstrates its satisfactory performance. ENN demonstrates better forecasting results than the penalized feature selection methods. Figure 7 exhibits the errors between actual values and forecast values provided by the proposed RS-SRSCAD-FA. It is obvious that the RS-SRSCAD-FA values
Figure 6: Plot of forecasting values and actual values of global solar radiation at Sites 5–8. (Scatter panels per forecasting model; axes: actual solar radiation (W/m²) and forecast global solar radiation (W/m²).)
closely match the actual data in almost all time periods. That is, RS-SRSCAD-FA consistently displays the best forecasting performance over a vast majority of the time period.

4.5. Statistical Comparisons of Results over Multiple Real Datasets. Statistical tests for comparison of the proposed algorithm with other existing algorithms are more reliable and trustworthy on multiple datasets than on a single dataset [61]. To make a fair comparison, a nonparametric Friedman test is applied to evaluate the forecasting and selection performances of the algorithms on these global solar radiation datasets. The Friedman test examines whether the measured
Figure 7: Plot of errors between actual global solar radiation and forecast values of RS-SRSCAD-FA at Sites 1–8. (One panel per site; the plotted quantity is the error between actual data and forecast values of RS-SRSCAD-FA, in W/m².)
average rank is significantly different from the mean rank that is likely under the null hypothesis. Table 5 reveals the ranks of the MAPEs, RMSEs, and TICs of the proposed approach and of the other competitors. It is convenient to calculate the $\chi_F^2$ and $F_F$ values as shown in the table. Using the six approaches and eight datasets, $F_F$ is distributed according to 8 − 1 = 7 and (10 − 1) ∗ (6 − 1) = 45 degrees of freedom. The critical value for the statistic $F(7, 45)$ for $\alpha = 0.1$ is 1.85; therefore, the null hypothesis is rejected, which indicates that there are differences between these algorithms and more tests are needed.
To confirm the advantage, the newly proposed RS-SRSCAD-FA method is regarded as the control algorithm, and the other competitors are tested against it. The most straightforward method is to calculate the critical difference (CD) with the Bonferroni–Dunn test [62]. According to [61], the critical value for $q_{0.1}$ for the 10 methods is 2.326. Thus, CD is $2.326 \times \sqrt{(10 \times 11)/(6 \times 8)} = 3.52$. It is evident that the forecasting models except LSSVM, PSO-BPNN, and SVM-FA yield comparable MAPEs as the differences between their average ranks are less than the CD value 3.52. For example, the difference between SRL and RS-SRSCAD-FA is less than 3.52 (4.31 − 3.25 = 1.06 < 3.52). In terms of RMSE, RS-SRSCAD-FA performs better than LSSVM, LASSO, SCAD, and SRL and is comparable with ENN, Angstrom–Prescott, CS-SRPQVSP-QRBF, and SVM-FA based on the differences between the average ranks. For instance, the RMSE difference between LASSO and RS-SRSCAD-FA is larger than 3.52 (7.81 − 1.00 = 6.81 > 3.52). The difference between Angstrom–Prescott and RS-SRSCAD-FA is less than 3.52 (2.38 − 1.00 = 1.38 < 3.52). Similar phenomena can be observed from the aspect of TIC and $R$ values: most of the results of the nonparametric Friedman test are consistent with our observations. Unlike ENN and SVM-FA, RS-SRSCAD-FA provides an interpretable model that does not employ all the covariates. Therefore, RS-SRSCAD-FA exhibits a higher prediction capacity relative to other forecasting methods.

5. Conclusions

Forecasting solar radiation is fundamental to solar power technology. For the utilization and conversion of solar power, accurate and continuous radiation data is essential. Therefore, establishing an accurate and efficient forecasting model plays a vital role in global solar radiation forecasting. To overcome the drawbacks of a single model, which yields low forecasting accuracy, a novel global solar radiation forecasting method called RS-SRSCAD-FA has been proposed. The main structure of the novel method is ensemble learning for enhancing the forecasting accuracy and model stability. An efficient covariate selection method, SRSCAD, is used in the input space; simultaneously, a weight vector, which best represents the importance of each individual model in the ensemble system, is determined by the FA. To illustrate the validity and superiority of the proposed method, the datasets from eight locations in the Xinjiang area of China have been applied as real data examples. The results demonstrate that the proposed RS-SRSCAD-FA method is superior to other forecasting approaches.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that the received funding did not lead to any conflicts of interest regarding the publication of this
Table 5: Ranks of forecasting models for Sites 1–8 of global solar radiation data.
Methods Site 1 Site 2 Site 3 Site 4 Site 5 Site 6 Site 7 Site 8 Average rank $\chi_F^2$ $F_F$
MAPE
ENN 2 4 3.5 5 3 1 1.5 3 2.88 46.23 12.56
LSSVM 10 10 10 10 10 6 10 6 9.00
LASSO 1 3 1 2 5 4.5 5.5 7 3.63
SCAD 5 1.5 2 5 4 4.5 4 4 3.75
SRL 5 1.5 3.5 5 2 7 5.5 5 4.31
Angstrom–Prescott 7 5 5 7 7 8 7 8.5 6.81
CS-SRPQVSP-QRBF 5 8 7 2 6 3 3 1.5 4.44
PSO-BPNN 8 7 9 9 9 10 8.5 10 8.81
SVM-FA 9 6 8 8 8 9 8.5 8.5 8.13
RS-SRSCAD-FA 3 9 6 2 1 2 1.5 1.5 3.25
RMSE
ENN 5 4 4 4 3 4 4 5 4.13 61.63 41.60
LSSVM 2 10 10 10 10 7 10 9 8.50
LASSO 8 7 7 7 7 9.5 7 10 7.81
SCAD 9 8 8 9 8 9.5 9 7 8.44
SRL 10 9 9 8 9 8 8 8 8.63
Angstrom–Prescott 3 2 2 3 2 2 2 3 2.38
CS-SRPQVSP-QRBF 6 3 3 2 5 3 3 2 3.38
PSO-BPNN 7 6 6 6 6 6 6 6 6.13
SVM-FA 4 5 5 5 4 5 5 4 4.63
RS-SRSCAD-FA 1 1 1 1 1 1 1 1 1.00
TIC
ENN 6 3.5 4.5 4.5 2.5 4 4.5 5 4.31 59.57 33.56
LSSVM 2.5 10 10 10 10 8.5 10 9 8.75
LASSO 8 7 7.5 8 7.5 8.5 8 10 8.06
SCAD 8 8.5 7.5 8 7.5 8.5 8 7.5 7.94
SRL 8 8.5 9 8 7.5 8.5 8 7.5 8.13
Angstrom–Prescott 4 2 2 3 2.5 3 2.5 3 2.75
CS-SRPQVSP-QRBF 2.5 3.5 3 2 4.5 2 2.5 1.5 2.69
PSO-BPNN 10 5.5 6 6 7.5 5.5 6 6 6.56
SVM-FA 5 5.5 4.5 4.5 4.5 5.5 4.5 4 4.75
RS-SRSCAD-FA 1 1 1 1 1 1 1 1.5 1.06
𝑅
ENN 6 7 9 4 3.5 6 4 8 5.94 56.57 25.67
LSSVM 2 8 10 10 10 8 8 10 8.25
LASSO 9 9 8 9 8.5 9.5 9 9 8.88
SCAD 10 10 5 6.5 8.5 9.5 10 7 8.31
SRL 5 6 6.5 8 7 7 7 6 6.56
Angstrom–Prescott 3 2 2 2 2 2 2 2 2.13
CS-SRPQVSP-QRBF 7 3 3 3 3.5 3 3 3 3.56
PSO-BPNN 8 4.5 6.5 6.5 6 4.5 6 5 5.88
SVM-FA 4 4.5 4 5 5 4.5 5 4 4.50
RS-SRSCAD-FA 1 1 1 1 1 1 1 1 1.00
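The rank comparison behind Table 5 can be sketched in a few lines of Python. The sketch below is illustrative only and is not the authors' code: it recomputes the MAPE average ranks from the table, applies the textbook Friedman formulas (without a tie correction, which is an assumption), and flags the models whose average rank differs from that of RS-SRSCAD-FA by more than the CD value of 3.52 quoted in the text. The function and variable names are hypothetical. Because ties are not corrected and the inputs are not rounded, the statistics come out close to, but not exactly equal to, the values reported in the table.

```python
# Ranks of the ten models on the eight sites (MAPE block of Table 5).
MAPE_RANKS = {
    "ENN":               [2, 4, 3.5, 5, 3, 1, 1.5, 3],
    "LSSVM":             [10, 10, 10, 10, 10, 6, 10, 6],
    "LASSO":             [1, 3, 1, 2, 5, 4.5, 5.5, 7],
    "SCAD":              [5, 1.5, 2, 5, 4, 4.5, 4, 4],
    "SRL":               [5, 1.5, 3.5, 5, 2, 7, 5.5, 5],
    "Angstrom-Prescott": [7, 5, 5, 7, 7, 8, 7, 8.5],
    "CS-SRPQVSP-QRBF":   [5, 8, 7, 2, 6, 3, 3, 1.5],
    "PSO-BPNN":          [8, 7, 9, 9, 9, 10, 8.5, 10],
    "SVM-FA":            [9, 6, 8, 8, 8, 9, 8.5, 8.5],
    "RS-SRSCAD-FA":      [3, 9, 6, 2, 1, 2, 1.5, 1.5],
}

CD = 3.52  # critical difference as quoted in the text (not recomputed here)


def average_ranks(ranks):
    """Mean rank of each model over all sites."""
    return {model: sum(r) / len(r) for model, r in ranks.items()}


def friedman_statistics(avg_ranks, n_sites):
    """Friedman chi-square and its F-distributed variant (no tie correction)."""
    k = len(avg_ranks)
    sum_sq = sum(r * r for r in avg_ranks.values())
    chi2 = 12.0 * n_sites / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4.0)
    f_f = (n_sites - 1) * chi2 / (n_sites * (k - 1) - chi2)
    return chi2, f_f


def differs_from(avg_ranks, reference, cd):
    """Models whose average rank differs from the reference by more than cd."""
    ref = avg_ranks[reference]
    return sorted(
        model
        for model, rank in avg_ranks.items()
        if model != reference and abs(rank - ref) > cd
    )


avg = average_ranks(MAPE_RANKS)
chi2, f_f = friedman_statistics(avg, n_sites=8)
different = differs_from(avg, "RS-SRSCAD-FA", CD)

print(f"chi2_F = {chi2:.2f}, F_F = {f_f:.2f}")
print("Differ from RS-SRSCAD-FA by more than CD:", different)
```

Swapping in the RMSE, TIC, or 𝑅 rows of the table gives the corresponding comparisons for the other metrics.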
manuscript and there are no conflicts of interest regarding the publication of this article.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (Grant Nos. 71761016 and 71861012), the Natural Science Foundation of Jiangxi, China (Grant Nos. 20181BAB211020 and 20171BAA218001), the China Postdoctoral Science Foundation (Grant Nos. 2017M620277 and 2018T110654), and the Scientific Research Fund of Jiangxi Provincial Education Department (Grant No. GJJ180287).
References International Journal of Climatology, vol. 32, no. 2, pp. 274–285,

2012.
[1] H. Jiang, Y. Dong, J. Wang, and Y. Li, “Intelligent optimization [19] J.-L. Chen, G.-S. Li, and S.-J. Wu, “Assessing the potential of
models based on hard-ridge penalty and RBF for forecasting support vector machine for estimating daily solar radiation
global solar radiation,” Energy Conversion and Management, vol. using sunshine duration,” Energy Conversion and Management,
95, pp. 42–58, 2015. vol. 75, pp. 311–318, 2013.
[2] R. Calif, F. G. Schmitt, Y. Huang, and T. Soubdhan, “Intermit- [20] B. B. Ekici, “A least squares support vector machine model for
tency study of high frequency global solar radiation sequences prediction of the next day solar insolation for effective use of PV
under a tropical climate,” Solar Energy, vol. 98, pp. 349–365, systems,” Measurement, vol. 50, no. 1, pp. 255–262, 2014.
2013.
[21] S. Cao and J. Cao, “Forecast of solar irradiance using recurrent
[3] R. Calif, F. G. Schmitt, and O. Durán, Medina. 5/3 Kolmogorov neural networks combined with wavelet analysis,” Applied
Turbulent Behaviour and Intermittent Sustainable Energies, Thermal Engineering, vol. 25, no. 2-3, pp. 161–172, 2005.
2017.
[22] H. Jiang and Y. Dong, “Global horizontal radiation forecast
[4] Z. Zeng, H. Yang, R. Zhao, and J. Meng, “Nonlinear character- using forward regression on a quadratic kernel support vector
istics of observed solar radiation data,” Solar Energy, vol. 87, no. machine: case study of the Tibet autonomous region in China,”
1, pp. 204–218, 2013. Energy, vol. 133, pp. 270–283, 2017.
[5] S. Hussain and A. A. Alili, “A pruning approach to optimize [23] A. Rohani, M. Taki, and M. Abdollahpour, “A novel soft com-
synaptic connections and select relevant input parameters for puting model (Gaussian process regression with K-fold cross
neural network modelling of solar radiation,” Applied Soft validation) for daily and monthly solar radiation forecasting
Computing, vol. 52, pp. 898–908, 2017. (Part: I),” Journal of Renewable Energy, vol. 115, pp. 411–422,
[6] C. Voyant, G. Notton, S. Kalogirou et al., “Machine learning 2018.
methods for solar radiation forecasting: a review,” Journal of [24] M. Guermoui, K. Gairaa, A. Rabehi, D. Djafer, and S. Benka-
Renewable Energy, vol. 105, pp. 569–582, 2017. ciali, “Estimation of the daily global solar radiation based on
[7] Y. Kashyap, A. Bansal, and A. K. Sao, “Solar radiation forecast- the Gaussian process regression methodology in the Saharan
ing with multiple parameters neural networks,” Renewable & climate,” The European Physical Journal Plus, vol. 133, no. 6, p.
Sustainable Energy Reviews, vol. 49, pp. 825–835, 2015. 211, 2018.
[8] K. Gairaa, A. Khellaf, Y. Messlem, and F. Chellali, “Estimation of [25] M. Guermoui, F. Melgani, and C. Danilo, “Multi-step ahead
the daily global solar radiation based on Box-Jenkins and ANN forecasting of daily global and direct solar radiation: a review
models: a combined approach,” Renewable & Sustainable Energy and case study of Ghardaia region,” Journal of Cleaner Produc-
Reviews, vol. 57, pp. 238–249, 2016. tion, vol. 201, pp. 716–734, 2018.
[9] M. Ozgoren, M. Bilgili, and B. Sahin, “Estimation of global [26] J. Wang, Y. Xie, C. Zhu, and X. Xu, “Solar radiation prediction
solar radiation using ANN over Turkey,” Expert Systems with based on phase space reconstruction of wavelet neural net-
Applications, vol. 39, no. 5, pp. 5043–5051, 2012. work,” Procedia Engineering, vol. 150, no. 9, pp. 4603–4607, 2011.
[10] H. A. N. Hejase, M. H. Al-Shamisi, and A. H. Assi, “Modeling of [27] S. Monjoly, M. André, R. Calif, and T. Soubdhan, “Hourly
global horizontal irradiance in the United Arab Emirates with forecasting of global solar radiation based on multiscale decom-
artificial neural networks,” Energy, vol. 77, pp. 542–552, 2014. position methods: a hybrid approach,” Energy, vol. 119, pp. 288–
[11] C. Renno, F. Petito, and A. Gatto, “ANN model for predicting 298, 2017.
the direct normal irradiance and the global radiation for a [28] S. Hussain and A. AlAlili, “A hybrid solar radiation modeling
solar application to a residential building,” Journal of Cleaner approach using wavelet multiresolution analysis and artificial
Production, vol. 135, pp. 1298–1316, 2016. neural networks,” Applied Energy, vol. 208, pp. 540–550, 2017.
[12] R. C. Deo, X. Wen, and F. Qi, “A wavelet-coupled support vector [29] M. A. Hassan, A. Khalil, S. Kaseb, and M. A. Kassem, “Exploring
machine model for forecasting global incident solar radiation the potential of tree-based ensemble methods in solar radiation
using limited meteorological dataset,” Applied Energy, vol. 168, modeling,” Applied Energy, vol. 203, pp. 897–916, 2017.
pp. 568–593, 2016. [30] J.-J. Wang, J.-Z. Wang, Z.-G. Zhang, and S.-P. Guo, “Stock index
[13] P. Coulibaly and N. D. Evora, “Comparison of neural network forecasting based on a hybrid model,” Omega , vol. 40, no. 6, pp.
methods for infilling missing daily weather records,” Journal of 758–766, 2012.
Hydrology, vol. 341, no. 1-2, pp. 27–41, 2007. [31] W. G. Zhao, J. Z. Wang, and H. Y. Lu, “Combining forecasts
[14] V. N. Vapnik, The Nature of Statistical Learning Theory, Springer, of electricity consumption in China with time-varying weights
1995. updated by a high-order Markov chain model,” Omega, vol. 450,
[15] M. A. Hearst, S. T. Dumais, E. Osuna, J. Platt, and B. Scholkopf, no. 45, pp. 80–91, 2014.
“Support vector machines,” IEEE Intelligent Systems and Their [32] Y. Gala, Á. Fernández, J. Dı́az, and J. R. Dorronsoro, “Hybrid
Applications, vol. 130, no. 4, pp. 18–28, 2002. machine learning forecasting of solar radiation values,” Neuro-
[16] P.-S. Yu, S.-T. Chen, and I.-F. Chang, “Support vector regression computing, vol. 176, pp. 48–59, 2016.
for real-time flood stage forecasting,” Journal of Hydrology, vol. [33] H. Sun, D. Gui, B. Yan et al., “Assessing the potential of random
328, no. 3-4, pp. 704–716, 2006. forest method for estimating solar radiation using air pollution
[17] B. Zhu and Y. Wei, “Carbon price forecasting with a novel hybrid index,” Energy Conversion and Management, vol. 119, pp. 121–
ARIMA and least squares support vector machines method- 129, 2016.
ology,” OMEGA - The International Journal of Management [34] R. Aler, I. M. Galván, J. A. Ruiz-Arias, and C. A. Gueymard,
Science, vol. 41, no. 3, pp. 517–524, 2012. “Improving the separation of direct and diffuse solar radiation
[18] W. Wu and H.-B. Liu, “Assessment of monthly solar radiation components using machine learning by gradient boosting,”
estimates using support vector machines and air temperatures,” Solar Energy, vol. 150, pp. 558–569, 2017.
[35] R. Tibshirani, “Regression shrinkage and selection via the [56] H. Jiang, “A novel approach for forecasting global horizontal
lasso,” Journal of the Royal Statistical Society: Series B (Statistical irradiance based on sparse quadratic RBF neural network,”
Methodology), vol. 58, no. 1, pp. 267–288, 1996. Energy Conversion and Management, vol. 152, pp. 266–280, 2017.
[36] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, “Least angle [57] M. A. Mohandes, “Modeling global solar radiation using Parti-
regression,” The Annals of Statistics, vol. 32, no. 2, pp. 407–499, cle Swarm Optimization (PSO),” Solar Energy, vol. 86, no. 11, pp.
2004. 3137–3145, 2012.
[37] H. Zou and T. Hastie, “Regularization and variable selection [58] L. Olatomiwa, S. Mekhilef, S. Shamshirband, K. Mohammadi,
via the elastic net,” Journal of the Royal Statistical Society B: D. Petković, and C. Sudheer, “A support vector machine-firefly
Statistical Methodology, vol. 67, no. 2, pp. 301–320, 2005. algorithm-based model for global solar radiation prediction,”
[38] H. Zou, “The adaptive lasso and its oracle properties,” Journal of Solar Energy, vol. 115, pp. 632–644, 2015.
the American Statistical Association, vol. 101, no. 476, pp. 1418– [59] H. Akaike, “A new look at the statistical model identification,”
1429, 2006. IEEE Transactions on Automatic Control, vol. 19, no. 6, pp. 716–
[39] D. Yang, Z. Ye, L. H. I. Lim, and Z. Dong, “Very short term 723, 1974.
irradiance forecasting using the lasso,” Solar Energy, vol. 114, pp. [60] A. Belloni, V. Chernozhukov, and L. Wang, “Square-root lasso:
314–326, 2015. pivotal recovery of sparse signals via conic programming,”
[40] T. Zhang, “Multi-stage convex relaxation for feature selection,” Biometrika, vol. 98, no. 4, pp. 791–806, 2011.
Bernoulli, vol. 19, no. 5, pp. 2277–2293, 2013. [61] A. Janez, “Statistical comparisons of classifiers over multiple
[41] K. Knight and W. Fu, “Asymptotics for lasso-type estimators,” data sets,” Journal of Machine Learning Research, vol. 70, no. 1,
The Annals of Statistics, vol. 28, no. 5, pp. 1356–1378, 2000. pp. 1–30, 2006.
[42] J. Fan and R. Li, “Variable selection via nonconcave penalized [62] O. J. Dunn, “Multiple comparisons among means,” Journal of
likelihood and its oracle properties,” Journal of the American the American Statistical Association, vol. 56, no. 293, pp. 52–64,
Statistical Association, vol. 96, no. 456, pp. 1348–1360, 2001. 1961.
[43] P. Zhao and B. Yu, “On model selection consistency of Lasso,”
Journal of Machine Learning Research, vol. 7, no. 12, pp. 2541–
2563, 2006.
[44] K. Lounici, “Sup-norm convergence rate and sign concentration
property of Lasso and Dantzig estimators,” Electronic Journal of
Statistics, vol. 2, pp. 90–102, 2008.
[45] G. E. P. Box, “Science and statistics,” Journal of the American
Statistical Association, vol. 71, no. 356, pp. 791–799, 1976.
[46] J. M. Bates and C. W. J. Granger, “The combination of forecasts,”
Operational Research Quarterly, vol. 20, no. 4, pp. 451–468, 1969.
[47] T. K. Ho, “The random subspace method for constructing
decision forests,” IEEE Transactions on Pattern Analysis and
Machine Intelligence, vol. 20, no. 8, pp. 832–844, 1998.
[48] T. K. Ho, “Random decision forests,” in Proceedings of the 3rd
International Conference on Document Analysis and Recogni-
tion, pp. 278–282, Montreal, Canada, 1995.
[49] X. She Yang, “Firefly algorithm, lévy flights and global opti-
mization. Research and development in intelligent systems,”
Research and Development in Intelligent Systems XXVI, vol. 26,
pp. 209–218, 2010.
[50] O. P. Verma, D. Aggarwal, and T. Patodi, “Opposition and
dimensional based modified firefly algorithm,” Expert Systems
with Applications, vol. 44, pp. 168–176, 2016.
[51] H. Wang, W. Wang, X. Zhou et al., “Firefly algorithm with
neighborhood attraction,” Information Sciences, vol. 382-383,
pp. 374–387, 2017.
[52] X. She Yang, Nature-Inspired Metaheuristic Algorithms, Luniver
Press, 2008.
[53] Y. She, “Thresholding-based iterative selection procedures for
model selection and shrinkage,” Electronic Journal of Statistics,
vol. 3, pp. 384–415, 2009.
[54] Yu. Nesterov, “Gradient methods for minimizing composite
objective function,” Technical Report, Université catholique de
Louvain, Center for Operations Research and Econometrics
(CORE), 2007.
[55] L. Feng, A. Lin, L. Wang, W. Qin, and W. Gong, “Evaluation
of sunshine-based models for predicting diffuse solar radiation
in China,” Renewable & Sustainable Energy Reviews, vol. 94, pp.
168–182, 2018.