A new data classification method based on chaotic particle swarm optimization and least square-support vector machine
Article history: Received 3 June 2015; received in revised form 28 July 2015; accepted 17 August 2015; available online 23 August 2015.

Keywords: Chaotic optimization algorithm; Particle swarm optimization algorithm; Least square-support vector machine (LS-SVM); Data classification; Generalization ability; Classification accuracy

Abstract

In order to improve classification accuracy on chemometrics data, the chaotic optimization algorithm (COA) and the particle swarm optimization (PSO) algorithm are introduced into the least square-support vector machine (LS-SVM) model in this paper, yielding an optimized LS-SVM model and a novel data classification method (CPL-SVM). In the proposed CPL-SVM method, the COA, with its randomness and ergodicity, is used to chaotically process the initial positions and local best positions of the particles in the PSO algorithm, producing a chaotic particle swarm optimization (CPSO) algorithm; the CPSO algorithm then selects and optimizes the key parameters of the LS-SVM, and the optimized parameters give an improved CPL-SVM classifier. The randomness of manual parameter choice is avoided and the parameter-selection workload is reduced. The method not only overcomes the time consumption and blindness of the cross validation method, but also retains small-sample learning ability. To verify the effectiveness of the CPL-SVM method, binary classification data, the IRIS flower data and three data sets describing pharmacodynamic properties of drugs are selected in this paper. The experimental results show that the proposed CPL-SVM method achieves better learning performance, strong generalization ability, and the best sensitivity, Matthews correlation coefficient and classification accuracy, and that it effectively avoids the effects of isolated samples in the learning process.

© 2015 Elsevier B.V. All rights reserved.
1. Introduction

Chemometrics has been defined as the chemical discipline that uses mathematical and statistical methods to design and select optimal procedures and experiments, and to extract the maximum chemical information from analyzed chemical data. The most prominent part of chemometrics is the classification of the obtained data with intelligent methods; chemometrics is mainly concerned with extracting information from these data. A hidden relationship often exists between the available data and the desired information, and the goal of the analysis is to find such relationships and classify the data with new intelligent methods. These intelligent methods include neural networks (NN), genetic algorithms (GA), simulated annealing (SA), the particle swarm optimization (PSO) algorithm, statistical analysis, the support vector machine (SVM) and so on. However, chemometrics data are mostly multi-factor, noisy, nonlinear and irregular, so these complex data must be classified in order to discover the interdependent relationships among their features and to extract a data model. The SVM model [1], proposed by Vapnik, is a machine learning method for data classification based on the structural risk minimization principle. The model is a convex optimization problem, so the global optimum can be found. It can effectively solve complex problems with small samples, nonlinearity and local minima, avoiding slow convergence and entrapment in local minima. On the basis of the SVM model, Suykens et al. [2,3] proposed the least square-support vector machine (LS-SVM). The LS-SVM model transforms the SVM from a quadratic programming problem into a set of linear equations, which reduces the computational complexity and improves the calculation speed on large samples. The LS-SVM has been widely applied in pattern recognition, data mining, image analysis, network security and so on. However, the parameters of the LS-SVM model strongly influence its optimization performance and learning precision, so how to optimize these parameters is an important research problem in machine learning.

In this paper, the chaotic optimization algorithm (COA), with its randomness and ergodicity, is introduced into the PSO algorithm to make up for the latter's low convergence speed, late-stage oscillation and tendency to fall into local minima, and a chaotic particle swarm optimization (CPSO) algorithm combining the COA and PSO is proposed. In the LS-SVM model, the regularization parameter γ and the radial basis kernel width parameter σ are very important for the optimization performance. It is an open problem in the field of LS-SVM how to find
⁎ Corresponding author. Tel.:+86 571 8755 7136. basis kernel width parameter σ are very important for the optimization
E-mail address: puigpuig2010@gmail.com (F. Liu). performance. It is an open problem in the field of LS-SVM how to find
http://dx.doi.org/10.1016/j.chemolab.2015.08.015
148 F. Liu, Z. Zhou / Chemometrics and Intelligent Laboratory Systems 147 (2015) 147–156
the optimal values of the regularization parameter γ and the radial basis kernel width parameter σ. The common choice is the cross validation method, but it not only consumes a great deal of computing time but also involves a certain blindness. So the proposed CPSO algorithm, with its global optimization ability, is used to optimize the parameters of the LS-SVM model. This not only overcomes the time consumption and blindness of the cross validation method, but also retains small-sample learning ability, so as to improve the learning performance, generalization ability and robustness. Finally, a new data classification method based on the CPSO algorithm and the LS-SVM model (CPL-SVM) is proposed in this paper. Binary classification data, the IRIS flower data and three relevant data sets with pharmacodynamic properties of drugs are selected to verify the effectiveness of the proposed CPL-SVM method.

The rest of this paper is organized as follows. Section 2 briefly reviews related work on the SVM, the LS-SVM and their improved classification variants. Section 3 introduces the basic methods, including the COA, the PSO algorithm, the LS-SVM model and the diversity-guided mutation strategy. Section 4 presents the chaotic particle swarm optimization (CPSO) algorithm. Section 5 presents the novel data classification (CPL-SVM) method; its ideas, model and steps are described in detail. Section 6 applies the CPL-SVM method to data classification problems and analyzes the results. Finally, conclusions are drawn in Section 7.

2. Related works

In recent years, many researchers have studied the parameter optimization of the SVM and LS-SVM models from different points of view. They have proposed parameter-optimization methods such as empirical selection, gradient descent, cross validation, GA, the PSO algorithm and so on [4–8]. Temkoa et al. [9] proposed a fuzzy-integral-based combination of different information sources to classify a small set of highly confusable human non-speech sounds. Devos et al. [10] proposed a methodological approach to guide the optimization of SVM parameters based on a grid search minimizing the classification error rate. Tao et al. [11] proposed a new fast pruning algorithm for chemical pattern classification. Ghorbanzad'e and Fatemi [12] proposed a method for classifying central nervous system agents with LS-SVM based on their structural descriptors. Li et al. [13] proposed an automatic speaker age and gender identification approach combining seven different methods to improve the baseline performance. Huang et al. [14] proposed an informative tree kernel SVM classifier to model the relationship between bioactivity and molecular descriptors. Dong and Luo [15] proposed a bearing degradation classification method based on principal component analysis (PCA) and an optimized LS-SVM. Lou'i et al. [16] proposed two new multisensor data fusion algorithms to reduce the false detection rate and obtain reliable decisions on the presence of target objects. Zhang [17] proposed an improved SVM-based data classification method applying rational sample data selection and GA-controlled training parameter optimization. Yao and Yi [18] proposed a new License Plate (LP) detection technique based on multistage information fusion. Sung and Chung [19] proposed a distributed energy monitoring network system based on data fusion via an improved PSO algorithm. He et al. [20] proposed a method for classifying electronic nose data in rat wound infection detection based on SVM and wavelet analysis. Subhajit et al. [21] proposed a PSO method with an adaptive K-nearest neighborhood based gene selection technique to distinguish a small subset of useful genes.

Although these scholars have studied the parameter optimization of the SVM and LS-SVM models in depth from different angles and obtained good results, each proposed method has its own defects, such as low classification accuracy, weak generalization ability or slow convergence speed. So the CPSO algorithm, based on the COA and PSO, is proposed to select and optimize the parameters of the LS-SVM model in order to improve the classification accuracy, learning performance and generalization ability.

3. Basic methods

3.1. Chaotic optimization algorithm (COA)

Chaos often exists in nonlinear systems. It is a kind of bounded unstable dynamic behavior that exhibits sensitive dependence on the initial conditions. The chaotic optimization algorithm (COA) [22] is a population-based stochastic optimization algorithm that uses a chaotic mapping. The basic procedure of the COA is divided into two steps. First, the COA searches all points in turn within the range of the variables and selects the better point as the current optimum, exploiting chaotic ergodicity, regularity, initial sensitivity and topological transitivity. Then, with the current optimum as the center, a tiny chaotic disturbance is imposed and a careful search is performed in order to find the global optimum with higher probability. Owing to the non-repetition of chaos, the COA can carry out an overall search at higher speed, and it is characterized by easy implementation, short execution time and a robust mechanism.

Currently, several kinds of COA based on chaotic characteristics exist, such as the adaptive mutative scale COA [23], the mutative scale COA [24], the chaotic harmony search algorithm [25], multi-objective chaotic ant swarm optimization [26] and so on. Because the adaptive mutative scale COA refines the search space and has better search speed and higher search accuracy [23], it is used to optimize the particle swarm optimization (PSO) algorithm in this paper. Generally, the main problem in the COA is obtaining the chaotic variables, so the Logistic chaotic model is used to generate them. The mapping equation of the Logistic model is

Z_{n+1} = L(μ, Z_n) = μ Z_n (1 − Z_n), μ ∈ [0, 4], n = 0, 1, 2, 3, ⋯    (1)

where the control variable μ ∈ [0, 4] is the parameter of the Logistic map. It has been shown that, when μ = 4 and Z_n ∈ (0, 1), the Logistic mapping is in the chaotic state; that is, the sequences generated by the Logistic mapping from an initial condition Z_0 in this range are neither periodic nor convergent, while outside the given range the generated sequences converge to one specific value.

3.2. Particle swarm optimization (PSO)

The PSO algorithm [27] is a search algorithm that simulates the social behavior of birds within a flock. In the PSO algorithm, individuals, referred to as particles, are "flown" through a hyper-dimensional search space. The positions of the particles within the search space are changed according to the social-psychological tendency of individuals to emulate the success of other individuals. The change of a particle within the population is influenced by the experience, or knowledge, of itself and its neighbors; the consequence of modeling this social behavior is that the search is drawn back toward previously successful regions of the search space. Namely, the velocity (v) and position (x) of each particle are changed according to the following expressions:

v_{ij}(t+1) = w v_{ij}(t) + c_1 r_1 (pB_{ij}(t) − x_{ij}(t)) + c_2 r_2 (gB_{ij}(t) − x_{ij}(t))    (2)

x_{ij}(t+1) = x_{ij}(t) + v_{ij}(t+1)    (3)

where v_{ij}(t+1) is the velocity of the ith particle in the jth dimension at iteration t+1, x_{ij}(t+1) is the corresponding position, pB_{ij}(t) is the personal best position of particle i, gB_{ij}(t) is the global best position, and w is the inertia weight employed to control the impact of the previous velocity.
t denotes the iteration number, c_1 is the cognition learning factor, c_2 is the social learning factor, and r_1 and r_2 are random numbers in [0, 1]. Generally, each component of the velocity can be clamped to the range [−V_max, V_max] to control excessive roaming of the particles outside the search space. The PSO algorithm is terminated after a maximal number of generations, or when the best position in the population cannot be further improved after a large number of generations. The PSO algorithm has shown its robustness and efficacy in solving complex optimization problems.

3.3. Least square-support vector machine (LS-SVM)

The support vector machine (SVM) [1] is one of the popular tools for supervised machine learning; it is based on structural risk minimization. The LS-SVM model uses a least squares linear system as the loss function, and the inequality constraints of the SVM are replaced by equality constraints.

The given training sample set is S = {(x_i, y_i) | i = 1, 2, 3, ⋯, m}, where m is the number of samples, {x_i} ⊂ R^n are the input vectors and y_i ∈ {−1, 1} are the corresponding desired outputs; the input data are mapped into a high-dimensional feature space by a nonlinear mapping function ϕ(·). The optimal separating hyperplane must then satisfy the following conditions:

ω^T x_i + b ≥ 1, if y_i = 1;    ω^T x_i + b ≤ −1, if y_i = −1    (4)

where ω is the weight vector of the hyperplane and b is the offset. The classification decision function is

f(x_i) = sgn(ω^T x_i + b).    (5)

The classification model of the LS-SVM is described by the optimization problem

min_{ω,ξ,b} J(ω, ξ) = (1/2) ω^T ω + (γ/2) Σ_{i=1}^{m} ξ_i^2    (6)

s.t. y_i (ω^T ϕ(x_i) + b) = 1 − ξ_i, i = 1, 2, 3, ⋯, m    (7)

where ξ_i is a slack variable, b is the offset, ω is the weight vector, ξ = (ξ_1, ξ_2, ⋯, ξ_m), and γ is the regularization parameter balancing the fitting error against the model complexity.

The optimization problem is transformed into its dual space, and a Lagrange function is introduced to solve it. The Lagrangian of the LS-SVM model is

L(ω, b, ξ, α) = (1/2) ω^T ω + (γ/2) Σ_{i=1}^{m} ξ_i^2 − Σ_{i=1}^{m} α_i [y_i (ω^T ϕ(x_i) + b) − 1 + ξ_i]    (8)

where the α_i are the Lagrange multipliers, with α_i ≥ 0 (i = 1, 2, 3, ⋯, m). The classification decision function is then

f(x) = sgn( Σ_{i=1}^{m} α_i y_i K(x, x_i) + b )    (9)

3.4. Diversity-guided mutation strategy

Although the PSO algorithm can effectively solve global optimization problems, it cannot avoid premature convergence, which causes the population diversity to decline during the search. So a diversity-guided mutation strategy is introduced into the PSO algorithm. The diversity is calculated by

d^t = (1/(M|A|)) Σ_{i=1}^{M} sqrt( Σ_{j=1}^{n} (x_{ij}^t − x̄_j^t)^2 )    (10)

where x̄_j^t = (1/M) Σ_{i=1}^{M} x_{ij}^t and |A| is the length of the longest diagonal of the search space. If d^t falls below a given value d_min, the mutation operation is performed according to

z_j^t = y_j^t + λ |A| ε,  ε ∼ N(0, 1)    (11)

Then y_j^t = z_j^t and y_{g,j}^t = z_j^t are set, where y_{g,j}^t represents the individual optimal location of the particle, N(0, 1) is the standard normal distribution and λ is a user-specified parameter (λ > d_min).

When the mutation operation is performed, the offset of the particle from the global best position increases the value of |y_j^t − y_{ij}^t|, and the average optimal position is pulled away from its original position in order to extend the search range of the particles; the value of d^t then increases at each step.

3.5. Multi-population search strategy

The classical PSO algorithm and its improved variants always pursue the global optimum during the search, which can cause them to fall into local minima. The multi-population search strategy is proposed to improve the PSO algorithm. The idea is to divide the whole population into several subpopulations, each representing a sub-goal of the problem; all subpopulations coevolve through information transfer and knowledge sharing. This strategy uses the independent search of the particles to keep the optimization search spread over a large range of the search space, while still chasing the global optimum to ensure the convergence of the PSO algorithm, so as to balance the accuracy and efficiency of the optimization process.

4. Select kernel function and propose chaotic particle swarm optimization (CPSO) algorithm

4.1. The selection of kernel function

Although the kernel function can effectively use a nonlinear transformation to solve the inner-product calculation problem in the feature space, it raises the difficult problem of how to select the kernel function effectively. Because the mapping function ϕ(x_i) is implicit, the VC dimensions of the feature space are unknown. According to the structural risk minimization principle, the VC dimension of the given function class is the most important index for evaluating the generalization performance of machine learning. In this case, the kernel matrix is the only link between the input and the learning algorithm: the learning algorithm receives the information characteristics of the data through the kernel matrix. So the correlative components of the kernel matrix are used to analyze the LS-SVM model and evaluate the generalization performance of the learning system in this paper.

However, different kernel functions construct LS-SVM models with different performance, so selecting the kernel function is very important for constructing a well-performing LS-SVM model. There are several common kernel functions, such as the linear kernel, the polynomial kernel, the radial basis function (RBF) kernel, the Sigmoid kernel, the Fourier kernel and so on, but no single kernel function is best for all applications. In this paper, the LS-SVM model is used to classify complex data; because of its good learning ability, simple form, radial symmetry, good smoothness and analyticity, the RBF kernel is widely applied to such classification tasks and is the kernel adopted here.
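For reference, the RBF kernel can be evaluated as a full kernel matrix in a few lines. The sketch below assumes the common LS-SVM parameterization K(x_i, x_j) = exp(−‖x_i − x_j‖²/σ²); some references use 2σ² in the denominator, so treat the exact width scaling as an assumption:

```python
import numpy as np

def rbf_kernel_matrix(X, sigma):
    """Pairwise RBF kernel matrix K[i, j] = exp(-||x_i - x_j||^2 / sigma^2)."""
    # Squared Euclidean distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative values from round-off
    return np.exp(-d2 / sigma**2)

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = rbf_kernel_matrix(X, sigma=1.0)
# K is symmetric with unit diagonal; K[0, 1] = exp(-1)
```

The resulting matrix is exactly the "kernel matrix" discussed above: it is the only object the LS-SVM learning algorithm sees, which is why the choice of σ matters so much.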
The radial basis kernel width parameter σ and the regularization parameter γ are two important parameters of the LS-SVM model, so the selected values directly influence the learning ability and generalization performance of the LS-SVM.

4.2. Chaotic particle swarm optimization (CPSO) algorithm

The CPSO algorithm is mainly based on the sacrifice and memory property of the PSO algorithm for global exploration in the first stage; in the second stage, the COA is applied to search for better particles around the global best particles. The flow of the CPSO algorithm is shown in Fig. 1.

(Fig. 1: flowchart of the CPSO algorithm — initialize the parameters of the PSO algorithm; while the stopping condition is not met, check whether the global optimal position is updated; if it is not updated for the maximum number of renewals, jump out of the local optimum by using chaotic ergodicity; optimize the parameters of the LS-SVM; calculate the mean square error and output the optimal parameters; finally obtain the optimal LS-SVM (CPL-SVM) model.)

As shown in Fig. 1, the loop of the proposed CPSO algorithm is terminated when the preset maximal number of iterations is reached.

Table 1. The parameters of these algorithms.

Table 2. The obtained optimal values of parameters.

Optimization method   σ        γ        MSE
Cross validation      388.93   0.3604   4.61 × 10^−3
GA                    364.16   0.2837   7.93 × 10^−4
PSO                   345.76   0.239    4.22 × 10^−4
APSO                  342.24   0.226    8.45 × 10^−5
CPSO                  336.15   0.195    5.23 × 10^−5
(Figs. 3–5: scatter plots of the binary classification data and results; panels labeled "Points K = 31", "Points K = 54" and "Points K = 46", with both axes running from −1 to 1.)

of the Mth subpopulation. So the complete vector function P(·) is described as follows:

P(M, S_M y_i) = P(M, S_M y_i), if f(P(M, S_M x_i)) ≥ f(P(M, S_M y_i));  P(M, S_M x_i), if f(P(M, S_M x_i)) < f(P(M, S_M y_i))    (14)

P(M, Z) = (S_1 y, S_2 y, S_3 y, ⋯, S_{M−1} y, Z, S_{M+1} y, ⋯, S_K y)    (15)

where Z = S_M x_i denotes the current position of the ith particle in the Mth subpopulation, Z = S_M y_i denotes the best position of the ith particle in the Mth subpopulation, and Z = S̄_M y_i denotes the best position of the Mth subpopulation.

Eq. (14) updates the optimal position of the Mth subpopulation while the other subpopulations remain unchanged. That is, if the fitness value of the particle's current position is smaller than the fitness value of its individual best position, the individual best position is set equal to the current position; otherwise it remains unchanged.

The updating equation of the optimal position over all subpopulations is

P(M, S̄_M y) = arg min_{P(M, S_M y_i)} f(P(M, S_M y_i)), 1 ≤ i ≤ s, 1 ≤ k ≤ M.    (16)

5.2. The optimizing model of the CPL-SVM method

The optimizing model of the LS-SVM based on the CPSO algorithm is shown in Fig. 2.

5.3. The optimizing steps of the CPL-SVM method

Step 1 Initialize parameters. The parameters are initialized, including the population size N, the number of subpopulations M, the maximum number of iterations T_max, the current iteration t = 1, the learning factor range [c_min, c_max], the inertia weight range [w_min, w_max], the particle velocity range [V_min, V_max] and the particle position range [P^d_min, P^d_max], d = 1, 2, 3, ⋯, D. c_1, c_2, r_1, r_2 and w are generated.

Step 2 Select the fitness function. The fitness function is used to evaluate the performance of the CPL-SVM method; the fitness function (13) is used to calculate the fitness value of each particle.

Step 3 Initialize a vector Z^d_i(0) (d = 1, 2, 3, ⋯, D) with each component in the range (0, 1). Generate chaotic queues Z^d_i(t) (i = 1, 2, 3, ⋯, N) by iterating the Logistic model, Z^d_i(t) = 4 Z^d_i(t − 1) (1 − Z^d_i(t − 1)).

Step 4 Transform the chaotic queues into the range of the parameters of
the LS-SVM model according to P^d_i(t) = P^d_min + (P^d_max − P^d_min) Z^d_i(t).

Step 5 Calculate and compare the fitness values. If the current fitness value of particle P_i is smaller than P_ibest (the historical optimal fitness value of the particle), then P_ibest is updated to P_i. If P_ibest is smaller than P_gbest (the optimal fitness value of the population), then P_gbest is updated to P_ibest.

Step 6 Obtain the individual best P^d_ibest and the global best G^d_best.

Step 7 If the convergence or stopping criterion is satisfied (generally, a sufficiently good fitness value or the maximum number of iterations), go to Step 11.

Step 8 Update the velocity V_i and position P_i of each particle; at the same time, c_1, c_2, r_1, r_2 and w are obtained.

Step 9 Compare the fitness value of each particle with its individual best P^d_ibest; if the current value is better than P^d_ibest, update P^d_ibest to the current position. At the same time, compare the fitness value of each particle with the global best G^d_best; if the current value is better than G^d_best, update G^d_best to the current position.

Step 10 Determine the end condition. If the end condition is met, the searching process ends and the current best individual is returned; otherwise, return to Step 5 and recalculate until the termination condition is met or the number of iterations reaches T_max.

Step 11 The obtained optimal position gives the values of the parameters γ and σ of the LS-SVM model.

Step 12 Obtain the optimized LS-SVM (CPL-SVM) model.

6. Numerical experiments and discussions

In order to verify the effectiveness of the CPSO algorithm and the CPL-SVM method, binary classification data, the IRIS flower data and three relevant data sets with pharmacodynamic properties of drugs are selected in this paper. At the same time, the cross validation method [31], the genetic algorithm (GA) [32], the particle swarm optimization (PSO) algorithm [33] and the adaptive particle swarm optimization (APSO) algorithm [34] are selected for comparison with the proposed CPSO algorithm. The experiments run on an Intel(R) Core i5-4200U, 2.40 GHz, 4 GB RAM, Windows 7 and MATLAB 2010b. For the parameters of these algorithms, we started with classic values that have already been used in other studies and then modified them; the selected values are those that gave the best computational results concerning both the quality of the solution and the run time needed to achieve it. The parameters of these algorithms are given in Table 1.

Table 3. The classification results for the IRIS data set.

Method    Class       Samples  Correct  Accuracy  Average accuracy
CVL-SVM   Setosa      30       28       93.3%     86.3%
          Versicolor  30       26       86.7%
          Virginica   30       25       83.3%
GL-SVM    Setosa      30       29       96.7%     90%
          Versicolor  30       27       90%
          Virginica   30       25       83.3%
PL-SVM    Setosa      30       30       100%      93%
          Versicolor  30       27       90%
          Virginica   30       26       86.7%
APL-SVM   Setosa      30       30       100%      97%
          Versicolor  30       29       96.7%
          Virginica   30       28       93.3%
CPL-SVM   Setosa      30       30       100%      99%
          Versicolor  30       30       100%
          Virginica   30       29       96.7%

Table 4. The relevant data sets of SAR.

Data set  Compounds  Class+  Class−
HIA       196        131     65
P-gp      201        116     85
TdP       361        85      276

6.1. Analyze the optimization performance of the CPSO algorithm

In order to verify the optimization performance of the CPSO algorithm in searching for the optimal values of the parameters σ and γ, the cross validation method, the genetic algorithm (GA), the PSO algorithm and the adaptive particle swarm optimization (APSO) algorithm are selected for comparison. The parameter values of the CPSO, cross validation, GA, PSO and APSO algorithms are shown in Table 1. The five methods are then used to search for the optimal values of σ and γ; the obtained optimal values are shown in Table 2.

As can be seen in Table 2, the mean squared error (MSE) of the CPSO algorithm is 5.23 × 10^−5, the MSE of the cross validation method is 4.61 × 10^−3, the MSE of the GA is 7.93 × 10^−4, the MSE of the PSO algorithm is 4.22 × 10^−4, and the MSE of the APSO algorithm is 8.45 × 10^−5. So the MSE of the CPSO algorithm is smaller than those of the cross validation method, GA, PSO and APSO. The results show that the CPSO algorithm has better optimization ability in searching for the optimal values of σ and γ. The optimal values of σ and γ in Table 2 are then set in the LS-SVM model to obtain the cross validation LS-SVM (CVL-SVM), GA-LS-SVM (GL-SVM), PSO-LS-SVM (PL-SVM), APSO-LS-SVM (APL-SVM) and CPL-SVM models.

(Fig. 6: grouped bar chart of the Setosa, Versicolor, Virginica and average classification accuracies of the five methods; the values are those listed in Table 3.)

Fig. 6. The comparisons of the CVL-SVM, GL-SVM, PL-SVM, APL-SVM and CPL-SVM methods.
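As a quick arithmetic check on Table 3, the per-class and average accuracies follow directly from the correct/total counts. The snippet below uses the CPL-SVM counts from Table 3; note that the 98.9% average it produces is reported rounded to 99% in the table:

```python
# Correct / total test counts per class for the CPL-SVM row of Table 3
results = {"Setosa": (30, 30), "Versicolor": (30, 30), "Virginica": (29, 30)}

per_class = {name: correct / total for name, (correct, total) in results.items()}
average = sum(per_class.values()) / len(per_class)

print({name: f"{acc:.1%}" for name, acc in per_class.items()}, f"avg = {average:.1%}")
```

The same computation reproduces the averages of the other rows (for example, 86.3% for CVL-SVM from 93.3%, 86.7% and 83.3%).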
Table 5 IRIS flower data is a classic pattern recognition problem. There have
The classification results for the relevant data sets. three kinds of Setosa flower, Versicolor flower and Virginica flower.
Data Method TP TN FP FN SE SP PR MCC ACC Each kind of flower has formed the sample with four dimensions
set (%) (%) (%) (%) (%) according to the length and width of the petal and calyx. Suppose
HIA SVM 108 40 25 23 82.44 61.54 81.20 44.34 75.51 three kinds of flowers generate 150 samples that are divided into two
LS-SVM 111 42 23 20 84.73 64.62 82.84 49.96 78.06 parts. The front 20 samples are regarded as the training sample set
CVL-SVM 110 42 24 18 85.93 63.64 82.09 50.81 78.35 and the behind 30 samples are regarded as the testing sample set in
PL-SVM 115 40 25 16 87.79 61.54 82.14 51.40 79.08
each part. Experiments are performed independently 20 times, and
APL-SVM 114 41 24 18 86.37 63.08 82.61 51.75 78.68
CPL-SVM 113 43 22 18 86.26 66.15 83.70 58.07 79.59 the average values are regarded as the final classification results. The
P-gp SVM 94 52 33 22 81.03 61.68 74.02 43.24 72.64 IRIS flower data classification results are shown in Table 3. The compar-
LS-SVM 95 53 32 21 81.90 62.35 74.80 45.33 73.63 isons of the CVL-SVM, PL-SVM, APL-SVM and CPL-SVM methods are
CVL-SVM 100 54 31 17 85.47 63.53 76.34 53.94 76.24 shown in Fig. 6.
PL-SVM 103 55 30 13 88.79 64.71 77.44 56.21 79.10
APL-SVM 102 60 26 14 87.93 69.77 79.69 59.22 80.20
As can be seen in Table 3 and Fig. 6, under the same number of
Table 5 (excerpt; the HIA rows and the remaining P-gp rows fall outside this excerpt). Classification results: TP, TN, FP and FN are counts; SE, SP, PR, MCC and ACC are in %.

Data set  Method    TP   TN   FP  FN  SE     SP     PR     MCC    ACC
P-gp      CPL-SVM   102   65  20  14  87.93  76.47  83.61  65.14  83.08
TdP       SVM        60  232  44  25  70.59  84.06  57.69  51.20  80.89
TdP       LS-SVM     66  236  40  21  77.65  85.51  62.26  57.69  83.20
TdP       CVL-SVM    67  240  35  19  79.91  87.27  65.69  62.39  85.04
TdP       PL-SVM     69  246  30  16  81.18  89.13  69.70  66.87  86.78
TdP       APL-SVM    70  246  29  15  82.35  89.45  70.71  67.83  87.53
TdP       CPL-SVM    72  248  28  13  84.71  89.86  72.00  70.69  88.15

…values of the parameters σ and γ in Table 2 are set in the LS-SVM model in order to obtain the cross-validation LS-SVM (CVL-SVM), GA-LS-SVM (GL-SVM), PSO-LS-SVM (PL-SVM), APSO-LS-SVM (APL-SVM) and CPL-SVM models.

6.2. Binary classification

In order to verify the binary classification ability of the proposed CPL-SVM method, binary classification data sets (existing data) are classified under both linearly separable and nonlinearly separable conditions. For the linearly separable condition, the classification results are shown in Figs. 3 and 4; for the nonlinearly separable condition, the classification result is shown in Fig. 5.

As can be seen in Figs. 3, 4 and 5, as the number of training data grows, the number of support vectors and the training time also increase, while the support vectors remain only a small part of the total training samples. To obtain the best effect from the standard LS-SVM, an appropriate number of training samples should therefore be selected, balancing the classification training time against the classification accuracy.

6.3. The IRIS flower data classification

In order to verify the classification effectiveness of the proposed CPL-SVM method, the CVL-SVM (cross-validation LS-SVM), GL-SVM (GA-LS-SVM), PL-SVM (PSO-LS-SVM) [35] and APL-SVM (APSO-LS-SVM) [36] models and the IRIS flower data set are selected in this section.

…training samples. For Setosa, the classification accuracies of the CVL-SVM and GL-SVM methods are 93.3% and 96.7%, respectively, and the classification accuracies of the PL-SVM, APL-SVM and CPL-SVM methods are 100%. For Versicolor, the classification accuracies of the CVL-SVM, GL-SVM, PL-SVM, APL-SVM and CPL-SVM methods are 86.7%, 90%, 90%, 96.7% and 100%, respectively. For Virginica, the corresponding accuracies are 83.3%, 83.3%, 86.7%, 93.3% and 96.7%. The average classification accuracies of the CVL-SVM, GL-SVM, PL-SVM, APL-SVM and CPL-SVM methods are 86.3%, 90%, 93%, 97% and 99%, respectively. Therefore, the classification accuracy of the CPL-SVM method is better than that of the CVL-SVM, GL-SVM, PL-SVM and APL-SVM methods. The experiment results show that the proposed CPL-SVM method takes on strong generalization ability and can effectively avoid the isolating effects of samples in the active learning process.

6.4. The drug data classification

In order to evaluate the performance of the proposed CPL-SVM method, three relevant data sets with pharmacodynamic properties of drugs are selected here [37,38]. The human intestinal absorption (HIA) data set has 131 absorbable (HIA+) compounds and 65 non-absorbable (HIA−) compounds; a drug absorbed at more than 70% is considered absorbable. The P-glycoprotein (P-gp) data set describes many anti-cancer drugs that are transported out of cells by this transmembrane protein, which inhibits the chemotherapeutic treatment; increased P-gp expression represents multi-drug resistance. The P-gp data set has 116 substrates and 85 non-substrates. The TdP data set concerns a potentially fatal polymorphic ventricular tachycardia; it has 85 inducers (TdP+) and 276 non-inducers (TdP−). The same 159 descriptors are used to classify the three data sets, comprising 13 quantum chemical properties, 16 geometric properties, 18 simple molecular properties, 28 molecular connectivity and shape descriptors, and 84 electrotopological descriptors. Brief descriptions of the three data sets are shown in Table 4.
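The quality indexes reported for these drug data sets (SE, SP, PR, ACC and the Matthews correlation coefficient) follow the standard confusion-matrix definitions. As a hedged sketch (Python here, not the authors' MATLAB code), the TdP counts reported for the CPL-SVM method in Table 5 (TP = 72, TN = 248, FP = 28, FN = 13) reproduce the printed SE, SP, PR and MCC values:

```python
import math

def classification_indexes(tp, tn, fp, fn):
    """Standard confusion-matrix indexes used for the drug data sets."""
    se = tp / (tp + fn)                    # sensitivity
    sp = tn / (tn + fp)                    # specificity
    pr = tp / (tp + fp)                    # precision
    acc = (tp + tn) / (tp + tn + fp + fn)  # overall classification accuracy
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))  # Matthews corr. coef.
    return se, sp, pr, acc, mcc

# TdP counts reported for the CPL-SVM method in Table 5
se, sp, pr, acc, mcc = classification_indexes(tp=72, tn=248, fp=28, fn=13)
print(round(se * 100, 2), round(sp * 100, 2), round(pr * 100, 2),
      round(mcc * 100, 2))  # 84.71 89.86 72.0 70.69
```

The SE, SP, PR and MCC values match the printed TdP row; the printed ACC (88.15%) differs slightly from the raw (TP + TN)/total ratio for these counts, so it is not reproduced above.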
Fig. 7. The classification comparisons of the SVM, LS-SVM, CVL-SVM, PL-SVM, APL-SVM and CPL-SVM methods on the HIA, P-gp and TdP data sets.
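The comparisons in Fig. 7 rest on the evaluation protocol of this section: each data set is evenly divided into five subsets, four subsets train the model, the fifth estimates the error, and the whole procedure is repeated independently 20 times. A minimal sketch of that splitting scheme (Python, not the authors' code; the seeding and shuffling details are assumptions):

```python
import random

def repeated_five_fold(n_samples, n_repeats=20, seed=0):
    """Evenly divide the sample indexes into five subsets; four subsets
    train the model and the fifth estimates the error; the whole split
    is repeated independently n_repeats times."""
    rng = random.Random(seed)
    splits = []
    for _ in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        folds = [idx[k::5] for k in range(5)]  # five near-equal subsets
        for k in range(5):
            test = folds[k]
            train = [i for j in range(5) if j != k for i in folds[j]]
            splits.append((train, test))
    return splits

# e.g. the HIA set: 131 HIA+ and 65 HIA- compounds -> 196 samples
splits = repeated_five_fold(131 + 65)
```

Each repeat yields five train/test pairs, so 20 repeats give 100 evaluations per data set, whose averaged indexes would fill one row of Table 5.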
F. Liu, Z. Zhou / Chemometrics and Intelligent Laboratory Systems 147 (2015) 147–156 155
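The CPSO used by the CPL-SVM method differs from plain PSO in that the initial particle positions (and the local extreme positions) are processed chaotically by the COA. This excerpt does not specify the chaotic map, so the sketch below assumes the common logistic map at full chaos and hypothetical search bounds for the two LS-SVM parameters (σ, γ):

```python
def chaotic_positions(n_particles, dim, lo, hi, seed=0.37):
    """Generate PSO initial positions from the logistic map
    x_{k+1} = 4 * x_k * (1 - x_k) instead of uniform random numbers;
    the map's ergodicity spreads the swarm over [lo, hi]."""
    x = seed  # seed in (0, 1), avoiding the fixed values 0.25, 0.5, 0.75
    swarm = []
    for _ in range(n_particles):
        particle = []
        for _ in range(dim):
            x = 4.0 * x * (1.0 - x)             # logistic map iteration
            particle.append(lo + (hi - lo) * x)  # rescale to the search range
        swarm.append(particle)
    return swarm

# hypothetical bounds for the 2-dimensional (sigma, gamma) search space
swarm = chaotic_positions(n_particles=30, dim=2, lo=0.01, hi=100.0)
```

The swarm size and bounds above are illustrative only; the paper's CPSO additionally applies the same chaotic processing to the local extreme positions during the search.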
There are several indexes for data classification and evaluation: the numbers of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN), the sensitivity (SE), specificity (SP), overall classification accuracy (ACC), precision (PR) and Matthews correlation coefficient (MCC). These indexes are used to evaluate the performance of the CPL-SVM method in classifying the drug data. Each data set is evenly divided into five subsets: four subsets are selected to train the CPL-SVM model, and the remaining subset is used to calculate the error. This process is repeated independently 20 times in order to obtain an effective classification result for each part. The CPL-SVM method is compared with the SVM, LS-SVM, CVL-SVM, PL-SVM and APL-SVM methods. The classification results for the three data sets are shown in Table 5.

As can be seen in Table 5 and Fig. 7, which compare the classification of the HIA, P-gp and TdP data by the SVM, LS-SVM, CVL-SVM, PL-SVM, APL-SVM and CPL-SVM methods, the indexes TP, TN, FP, FN, SE, SP, ACC, PR and MCC are used to evaluate the performance of the CPL-SVM method. In all three data sets the data have a non-linear structure. For the HIA data, the CPL-SVM method obtains the best classification accuracy (79.59%), with a sensitivity of 86.26%, specificity of 66.15%, precision of 83.70% and Matthews correlation coefficient of 58.07%. For the P-gp data, the CPL-SVM method obtains the best classification accuracy (83.08%), specificity (76.47%), precision (83.61%) and Matthews correlation coefficient (65.14%). For the TdP data, the CPL-SVM method obtains the best sensitivity (84.71%), specificity (89.86%), precision (72.00%) and Matthews correlation coefficient (70.69%). These are the best results among the considered methods, although the sensitivity of the PL-SVM algorithm is better than that of the CPL-SVM algorithm for the HIA and P-gp data.

In general, for the HIA, P-gp and TdP data sets, the proposed CPL-SVM algorithm obtains the best sensitivity, specificity, classification precision, Matthews correlation coefficient and classification accuracy.

6.5. The effectiveness analysis of the CPL-SVM method

According to the classification results for the binary data, IRIS flower data and drug data, the COA, with its ergodicity, randomness and regularity, effectively adds chaotic scrambling to the initial positions and the optimum positions of the particles and improves the global searching ability. The CPSO algorithm can then optimize the parameters of the LS-SVM model to yield the optimized LS-SVM (CPL-SVM) model. This not only overcomes the time consumption and blindness of the cross-validation method, but also exploits the small-sample learning ability of LS-SVM, so as to improve the learning performance, generalization ability and robustness of LS-SVM. The binary classification data, IRIS flower data and the three data sets with pharmacodynamic properties of drugs are used to verify the effectiveness of the proposed CPL-SVM method. As can be seen in Tables 3–5 and Figs. 3–7, the CPL-SVM method obtains better classification accuracy than the SVM, LS-SVM, CVL-SVM, PSO-SVM and PL-SVM methods. The CPL-SVM method takes on strong generalization ability and the best sensitivity, classification precision, Matthews correlation coefficient and classification accuracy. At the same time, the CPL-SVM method can effectively avoid the isolating effects of samples in the active learning process.

7. Conclusion

Chemometrics has been defined as a chemical discipline. The LS-SVM is a simple method for realizing pattern recognition: the optimal hyperplane is solved from the initial samples in order to determine the decision function and identify other, unknown samples. In this paper, an improved PSO algorithm (CPSO) based on the COA and PSO is proposed, and the CPSO algorithm is used to optimize the parameters of the LS-SVM model with the RBF kernel in order to obtain a novel data classification method. This not only overcomes the time consumption and blindness of the cross-validation method, but also exploits the small-sample learning ability of LS-SVM, so as to improve the learning performance, generalization ability and robustness of LS-SVM. The binary classification data, IRIS flower data and three data sets with pharmacodynamic properties of drugs are used to verify the effectiveness of the proposed CPL-SVM method in MATLAB 2010b. The simulation results show that the CPSO algorithm can effectively optimize the parameters of the LS-SVM model, and that the proposed CPL-SVM method obtains a better classification ability than the SVM, LS-SVM, CVL-SVM, PL-SVM and APL-SVM methods on the chemical data. The CPL-SVM algorithm can effectively improve the recognition rate and speed up recognition.

Conflict of interest

The authors confirm that this article's content has no conflict of interest.

Acknowledgments

The authors would like to thank all the reviewers for their constructive comments. This research was supported by the National Natural Science Foundation of China (61303133, U1433124, 51475065), the Zhejiang Provincial Natural Science Foundation of China (LQ14F020007), the Open Project Program of the Guangxi Key Laboratory of Hybrid Computation and IC Design Analysis (HCIC201402), the Open Project Program of the Provincial Key Laboratory for Computer Information Processing Technology, Soochow University (KJS1326), the Open Project Program of the State Key Laboratory of Mechanical Transmissions (Chongqing University) (SKLMT-KFKT-201416), and the Open Project Program of the Traction Power State Key Laboratory of Southwest Jiaotong University (TPL1403). The program for the initialization, study, training and simulation of the proposed algorithm in this article was written with the toolbox of MATLAB 2010 produced by MathWorks, Inc.

References

[1] V.N. Vapnik, The Nature of Statistical Learning Theory, Springer, New York, 1995.
[2] J.A.K. Suykens, J. Vandewalle, Least squares support vector machine classifiers, Neural Process. Lett. 9 (3) (1999) 293–300.
[3] K. Pelckmans, J.A.K. Suykens, B. De Moor, Building sparse representations and structure determination on LS-SVM substrates, Neurocomputing 64 (3) (2005) 137–159.
[4] W. Deng, R. Chen, J. Gao, et al., A novel parallel hybrid intelligence optimization algorithm for a function approximation problem, Comput. Math. Appl. 63 (1) (2012) 325–336.
[5] Y. Bazi, F. Melgani, Semisupervised PSO-SVM regression for biophysical parameter estimation, IEEE Trans. Geosci. Remote Sens. 45 (6) (2007) 1887–1895.
[6] M.W. Mustafa, M.H. Sulaiman, et al., An application of genetic algorithm and least squares support vector machine for tracing the transmission loss in deregulated power system, IEEE Power Engineering and Optimization Conference (PEOCO), 2011, pp. 375–380.
[7] W. Deng, H.M. Zhao, J.J. Liu, et al., An improved CACO algorithm based on adaptive method and multi-variant strategies, Soft Comput. 19 (3) (2015) 701–713.
[8] W. Deng, R. Chen, B. He, et al., A novel two-stage hybrid swarm intelligence optimization algorithm and application, Soft Comput. 16 (10) (2012) 1707–1722.
[9] A. Temko, D. Macho, Fuzzy integral based information fusion for classification of highly confusable non-speech sounds, Pattern Recogn. 41 (5) (2008) 1814–1823.
[10] O. Devos, C. Ruckebusch, A. Durand, L. Duponchel, J.P. Huvenne, Support vector machines (SVM) in near infrared (NIR) spectroscopy: focus on parameters optimization and model interpretation, Chemom. Intell. Lab. Syst. 96 (1) (2009) 27–33.
[11] S.H. Tao, D.Z. Chen, W.X. Zhao, Fast pruning algorithm for multi-output LS-SVM and its application in chemical pattern classification, Chemom. Intell. Lab. Syst. 96 (1) (2009) 63–69.
[12] M. Ghorbanzad'e, M.H. Fatemi, A classification of central nervous system agents by least squares support vector machine based on their structural descriptors: a comparative study, Chemom. Intell. Lab. Syst. 110 (1) (2012) 102–107.
[13] M. Li, K.J. Han, S. Narayanan, Automatic speaker age and gender recognition using acoustic and prosodic level information fusion, Comput. Speech Lang. 27 (1) (2013) 151–167.
[14] X. Huang, D.S. Cao, Q.S. Xu, L. Shen, J.H. Huang, Y.Z. Liang, A novel tree kernel support vector machine classifier for modeling the relationship between bioactivity and molecular descriptors, Chemom. Intell. Lab. Syst. 120 (2013) 71–76.
[15] S.J. Dong, T.H. Luo, Bearing degradation process prediction based on the PCA and optimized LS-SVM model, Measurement 46 (9) (2013) 3143–3152.
[16] A.S. Lou'i, M.S. Saadawia, S. Dirk, Improved process monitoring and supervision based on a reliable multi-stage feature-based pattern recognition technique, Inf. Sci. 259 (2014) 282–294.
[17] Y.X. Zhang, An improved QSPR method based on support vector machine applying rational sample data selection and genetic algorithm-controlled training parameters optimization, Chemom. Intell. Lab. Syst. 134 (2014) 34–46.
[18] Z.J. Yao, W.D. Yi, License plate detection based on multistage information fusion, Inf. Fusion 18 (2014) 78–85.
[19] W.T. Sung, H.Y. Chung, A distributed energy monitoring network system based on data fusion via improved PSO, Measurement 55 (2014) 362–374.
[20] Q.H. He, J. Yan, Y. Shen, et al., Classification of electronic nose data in wound infection detection based on PSO-SVM combined with wavelet transform, Intell. Autom. Soft Comput. 18 (7) (2012) 967–979.
[21] K. Subhajit, D.S. Kaushik, M. Madhubanti, Gene selection from microarray gene expression data for classification of cancer subgroups employing PSO and adaptive K-nearest neighborhood technique, Expert Syst. Appl. 42 (1) (2015) 612–627.
[22] H.J. Lu, H.M. Zhang, L.H. Ma, A new optimization algorithm based on chaos, J. Zhejiang Univ. Sci. A 7 (4) (2006) 539–542.
[23] J.Q. E, C.H. Wang, Y.N. Wang, J.K. Gong, A new adaptive mutative scale chaos optimization algorithm and its application, J. Control Theory Appl. 6 (2) (2008) 141–145.
[24] X.F. Yuan, X.H. Dai, L.H. Wu, A mutative-scale pseudo-parallel chaos optimization algorithm, Soft Comput. 19 (5) (2015) 1215–1227.
[25] Q.K. Pan, L. Wang, L. Gao, A chaotic harmony search algorithm for the flow shop scheduling problem with limited buffers, Appl. Soft Comput. 11 (8) (2011) 5270–5280.
[26] J.J. Cai, X.Q. Ma, Q. Li, L.X. Li, H.P. Peng, A multi-objective chaotic ant swarm optimization for environmental/economic dispatch, Int. J. Electr. Power Energy Syst. 32 (5) (2010) 337–344.
[27] J. Kennedy, R. Eberhart, Particle swarm optimization, Proceedings of the IEEE International Conference on Neural Networks, IEEE Press, Piscataway, 1995, pp. 1942–1948.
[28] B.C. Kuo, H.H. Ho, C.H. Li, C.C. Hung, J.S. Taur, A kernel-based feature selection method for SVM with RBF kernel for hyperspectral image classification, IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 7 (1) (2014) 317–326.
[29] M. Vanny, K.E. Ko, S.M. Park, K.B. Sim, Physiological responses-based emotion recognition using multi-class SVM with RBF kernel, J. Inst. Control Robot. Syst. 19 (4) (2013) 364–371.
[30] Y.F. Sun, P.C. Zhang, J. Qi, A cognitive networks state prediction algorithm based on SVM improved by weighted RBF kernel function, Adv. Inf. Sci. Serv. Sci. 4 (2) (2012) 81–88.
[31] G.C. Cawley, N.L.C. Talbot, Fast exact leave-one-out cross-validation of sparse least-squares support vector machines, Neural Netw. 17 (10) (2004) 1467–1475.
[32] C.G. Fei, Z.Z. Han, Q.K. Liu, Ultrasonic flaw classification of seafloor petroleum transporting pipeline based on chaotic genetic algorithm and SVM, J. Xray Sci. Technol. 14 (1) (2006) 1–9.
[33] M.Q. Pan, D.H. Zeng, G. Xu, Temperature prediction of hydrogen producing reactor using SVM regression with PSO, J. Comput. 5 (3) (2010) 388–393.
[34] Y.S. Huang, J.J. Deng, Application of adaptive particle-swarm-optimized SVM to short-term load forecasting based on grey error calibration, J. Comput. Inf. Syst. 6 (3) (2010) 707–715.
[35] C.A. Perez, C.M. Aravena, J.I. Vallejos, P.A. Estevez, C.M. Held, Face and iris localization using templates designed by particle swarm optimization, Pattern Recogn. Lett. 31 (9) (2010) 857–868.
[36] X. Chen, J. Han, A novel classification approach based on support vector machine and adaptive particle swarm optimization algorithm, Proceedings of the 2008 International Symposium on Knowledge Acquisition and Modeling (KAM), 2008, pp. 703–707.
[37] Y. Xue, C.W. Yap, L.Z. Sun, Z.W. Cao, J.F. Wang, Y.Z. Chen, Prediction of P-glycoprotein substrates by a support vector machine approach, J. Chem. Inf. Comput. Sci. 44 (2004) 1497–1505.
[38] S.Y. Yang, Q. Huang, L.L. Li, C.Y. Ma, H. Zhang, R. Bai, Q.Z. Teng, M.L. Xiang, Y.Q. Wei, An integrated scheme for feature selection and parameter setting in the support vector machine modeling and its application to the prediction of pharmacokinetic properties of drugs, Artif. Intell. Med. 46 (2) (2009) 155–163.