
Journal of Water Process Engineering 33 (2020) 101081

journal homepage: www.elsevier.com/locate/jwpe
Emerging evolutionary algorithm integrated with kernel principal component analysis for modeling the performance of a water treatment plant
S.I. Abba (a,1), Quoc Bao Pham (b,1), A.G. Usman (c), Nguyen Thi Thuy Linh (b,d), D.S. Aliyu (e), Quyen Nguyen (f), Quang-Vu Bach (f,*)

(a) Department of Physical Planning Development, Yusuf Maitama Sule University Kano, Nigeria
(b) Department of Hydraulic and Ocean Engineering, National Cheng-Kung University, Tainan 701, Taiwan
(c) Department of Analytical Chemistry, Faculty of Pharmacy, Near East University, Near East Boulevard 99138, Nicosia, North Cyprus, via Mersin 10, Turkey
(d) Thuyloi University, 175 Tay Son, Dong Da, Hanoi, Vietnam
(e) Department of Civil Engineering, Newcastle University, UK
(f) Institute of Research and Development, Duy Tan University, Danang 550000, Vietnam
ARTICLE INFO

Keywords: Principal component analysis; Water treatment plant; Data-driven algorithms; Cross-validation; Kano-Tamburawa; Extreme learning machine

ABSTRACT

Providing a robust and reliable model is essential for hydro-environmental and public health engineering perspectives, including water treatment plants (WTPs). The current research develops an emerging evolutionary data-intelligence model: extreme learning machine (ELM) integrated with kernel principal component analysis (KPCA) to predict the performance of the Tamburawa WTP in Kano, Nigeria. A traditional feed-forward neural network (FFNN) and a classical linear autoregressive (AR) model were also employed to compare the predictive performance. For this purpose, different input data with the corresponding treated pH, turbidity, total dissolved solids, and hardness as the target variables obtained from the WTP were used. The predictive models are evaluated based on three numerical indices, namely Nash-Sutcliffe (NC), root mean squared error (RMSE) and mean absolute percentage error (MAPE). To examine the similarities and differences between the observed and predicted values, a two-dimensional graphical diagram (i.e., Taylor diagram) was also utilized. The predictive results revealed the potential of KPCA-ELM, which exhibited a high level of accuracy in comparison to the single models for all the considered variables, with a slight exception in terms of pH prediction. Two model combinations were built: the single models (FFNN, ELM, and AR) and their KPCA-based counterparts (KPCA-FFNN, KPCA-ELM, and KPCA-AR). The results also showed that both ELM and FFNN models demonstrated prediction skill and can therefore serve as reliable models. The outcomes may contribute to the modeling of the treated parameters and provide a reference benchmark for wastewater management and control in the Tamburawa WTP.
* Corresponding author.
E-mail address: bachquangvu@duytan.edu.vn (Q.-V. Bach).
1 These authors contributed equally to this work.
https://doi.org/10.1016/j.jwpe.2019.101081
Received 5 September 2019; Received in revised form 23 October 2019; Accepted 22 November 2019
Available online 20 December 2019
2214-7144/ © 2019 Elsevier Ltd. All rights reserved.

1. Introduction

The rapid increase in population, urbanization, and industrial and agricultural water demand threatens water treatment plants (WTPs) through capacity overload. According to the World Health Organization (WHO) and the United Nations International Children's Emergency Fund (UNICEF), water is one of the most indispensable factors needed to sustain life, and an affordable and adequate supply of water must be available [1,2]. WTPs are operated to remove bacteria, solids, micro-organisms, and other contaminants from untreated water through several processes and technologies [3]. The United Nations Educational, Scientific and Cultural Organization (UNESCO) reported that WTPs are essential components for attaining sustainable development and are crucial for public and environmental health [3,4]. Therefore, a satisfactory WTP is paramount to overcome the problems of water scarcity and to meet the domestic water standards required by law [6]. The physicochemical characteristics are often the major factors affecting the performance, operation, and control of WTPs [4,5]. Failure to keep the physicochemical parameters within the prescribed standard limits in any WTP may cause significant environmental and public health problems. On the other hand, appropriate and adequate control may be achieved by introducing a robust tool for modeling WTP performance [5–8]. The WTP process is complicated owing to the dilute mixture of several compositions, qualities, and characteristics of the system, which makes WTP parameters difficult to model [9–11]. Also, the biochemical and physical nature of WTPs exhibits non-linear phenomena that are too complicated to simulate by simple deterministic principles or mathematical models [12].

In the last decades, different linear models have been widely presented for managing the overall performance of WTPs, but most of them have limitations and are incapable of meeting the standard of non-linear modeling systems [13]. On the other hand, non-linear artificial intelligence (AI) models such as the artificial neural network (ANN), adaptive neuro-fuzzy inference system (ANFIS), and support vector machine (SVM) have shown merit in modeling the non-linear systems of WTPs. The models applied in hydro-environmental studies can be grouped into two categories, namely physical-based and data-driven models. Physical models are based on the concept of distributed (white-box) models that address the physical process and interaction for simulating the hydro-environmental system. In contrast, data-driven models are based on lumped (black-box) models that acquire the optimal correlation between inputs and outputs but neglect the physical process [14]. Besides, several efforts have been made to improve the accuracy and reliability of modeling the influent-effluent parameters in the field of hydro-environmental studies; however, no particular method has been proven to be applicable in modeling the process [11,15,13].

With this perspective, it could be stated that there is no single model that performs better than the others across different hydro-environmental systems, owing to the dynamic and complex nature of the data. This has necessitated the development of more reliable and efficient models using the available data [14–19]. For instance, Al-baidhani and Alameedee [20] developed an ANN model to predict the effluent pH and turbidity using various measured input parameters such as pH, temperature, and dose turbidity. The results demonstrated the suitability of ANN in modeling the WTP parameters. Wu and Lo [21] used ANN and ANFIS models to compute the real-time coagulant dosage in a WTP using the measurements of turbidity, pH and colour. The outcomes demonstrated that the ANFIS model was capable of accurately predicting the coagulant dosage compared with the ANN model. Gaya et al. [12] described the application of ANN and Hammerstein-Wiener (H-W) models for forecasting the influent turbidity in WTPs using different input parameters. The simulated results indicated that ANN could outperform the H-W model and may serve as an acceptable tool for modeling the turbidity of a WTP.

Similarly, other researchers [22–24] were able to employ an ANN for the prediction of optimum coagulants in WTPs. Yaseen et al. [25] performed another study on the ELM application to forecast the daily time-scale (in a tropical environment) of the Johor River located in Malaysia. The research findings provided evidence showing the capacity of the ELM model in the region. Nadiri et al. [26] studied the treatability of the Tabriz wastewater treatment plant (WWTP) using supervised committee fuzzy logic (SCFL) and committee fuzzy logic (CFL) approaches. Different measured influent water quality (WQ) parameters were used for the prediction of BOD, COD and TSS. The predicted results indicated the advantage of the SCFL approach over FL and CFL. Manu and Thalla [27] employed SVM and ANFIS models for the simulation of Kjeldahl nitrogen in a domestic WWTP located at Mangalore, India. The historical data obtained during the period from June 2014 to September 2014, including the influent pH, TSS, BOD, and Kjeldahl nitrogen, were used to attain the target objectives. The outcomes demonstrated the potential of the SVM model in modeling the biological processes in a WWTP. Likewise, the performance of ANN in modeling chemical oxygen demand in WWTPs was reported in [28] and [9].

On the other hand, several data processing and input variable selection methods have been applied in different prediction models in order to improve the prediction accuracy, including sensitivity analysis [29], principal component analysis (PCA) [24,25], linear discriminant analysis [32] and kernel principal component analysis (KPCA) [22]. It was reported that PCA and KPCA are commonly employed in dimension reduction, classification and feature extraction in multidimensional data sets [22,27,30]. However, in contrast to PCA, KPCA is capable of handling and capturing non-linear interaction within the process owing to its kernel function. According to the aforementioned literature, it was evident that several studies using AI models have been conducted and have shown promising performance, each model for a specific case. Although there is no exceptional model that exhibits superiority over others, applying the knowledge of the kernel input variable selection approach could lead to more promising outcomes. Owing to the problems of overfitting, local minima, and slow learning speed of some AI models such as ANN, an emerging algorithm known as the extreme learning machine (ELM) was proposed by Huang et al. [34] to overcome the disadvantages of traditional feed-forward backpropagation.

However, to the best of the authors' knowledge, this is the first study in which the Tamburawa WTP has been modeled using ELM, FFNN, and KPCA. The current study is aimed at the following: (i) to explore the potential of ELM with the kernel PCA for modeling the performance of the Tamburawa WTP in terms of different physicochemical parameters such as pH, turbidity, and hardness; (ii) to develop and compare the ELM with the traditional feed-forward neural network (FFNN) and classical autoregressive (AR) model using the same input combinations.

2. Methods and modeling development

2.1. Extreme learning machine (ELM)

The ELM was recently developed as a new learning approach whose primary advantage is its ability to map the internal features without the need to iteratively tune the parameters of the hidden neurons, as required in a traditional ANN model [34]. The input and hidden neuron weightings are computed randomly in the ELM from several pre-assigned neurons without having to pass through all the neurons in the model [35]. Also, the generalization capability of the ELM is acceptable, and it requires less computation time [36–38]. As a newly emerging black-box data-driven algorithm, the ELM was first proposed by [34] and is comprised of single hidden layer feedforward networks (SLFNs). The ELM is quite different from the traditional FFNN because it can overcome the problems of slow learning speed, local minima, and overfitting [33,31,34]. It is notable that the potential of the ELM could be attributed to its generalization ability and fast learning speed [39]. Owing to its promising performance, ELM has been applied in various fields of hydro-environmental studies [40]. The structure of the ELM network used in this study is presented in Fig. 1.

In this study, an ELM model was developed using calibration and validation data sets, as mentioned above. For a collection of $N$ training samples (i.e., $t = 1, 2, \ldots, N$) in which $x_t \in \mathbb{R}^d$ and $y_t \in \mathbb{R}$, an SLFN with $H$ hidden nodes is mathematically expressed as [34]:

$$\sum_{i=1}^{H} B_i \, g_i(\alpha_i \cdot x_t + \beta_i) = z_t \tag{1}$$

where $B \in \mathbb{R}^H$, $Z$ ($z_t \in \mathbb{R}$) and $G(\alpha, \beta, x)$ represent the predicted weights in the output layer, the model output and the activation function of the hidden layer, respectively, while $\alpha_i$, $\beta_i$, $i$ and $d$ indicate the weights of the randomized layers, the biases of these randomized layers, the index of the specific node in the hidden layer and the number of inputs, respectively.

The sigmoid activation function was found to be the best, and thus it is employed in this study as:

$$G(x) = \frac{1}{1 + \exp(-x)} \tag{2}$$
Fig. 1. Schematic of the ELM model.
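As an illustration of Eqs. (1) and (2), the following minimal Python sketch builds an SLFN with randomly assigned hidden weights and biases and a sigmoid activation, then obtains the output weights in one least-squares step using the Moore-Penrose generalized inverse that this section goes on to introduce. It is a sketch under stated assumptions, not the authors' implementation: the function names (`elm_fit`, `elm_predict`), the uniform range of the random weights, the number of hidden nodes and the synthetic data are all hypothetical choices made only for demonstration.

```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation of Eq. (2)."""
    return 1.0 / (1.0 + np.exp(-x))

def elm_fit(X, y, n_hidden, rng):
    """Fit an ELM: random hidden layer, analytic output weights."""
    d = X.shape[1]
    # Randomized input weights (alpha) and hidden biases (beta); never tuned.
    alpha = rng.uniform(-1.0, 1.0, size=(d, n_hidden))  # assumed range
    beta = rng.uniform(-1.0, 1.0, size=n_hidden)        # assumed range
    G = sigmoid(X @ alpha + beta)   # N x H hidden-layer output matrix
    B = np.linalg.pinv(G) @ y       # least-squares output weights
    return alpha, beta, B

def elm_predict(X, alpha, beta, B):
    """Evaluate the left-hand side of Eq. (1) for each sample."""
    return sigmoid(X @ alpha + beta) @ B

# Synthetic placeholder data (not the WTP records): 60 samples, 3 inputs.
rng = np.random.default_rng(0)
X = rng.uniform(size=(60, 3))
y = np.sin(X.sum(axis=1))

alpha, beta, B = elm_fit(X, y, n_hidden=20, rng=rng)
y_hat = elm_predict(X, alpha, beta, B)
rmse_train = np.sqrt(np.mean((y - y_hat) ** 2))
print("training RMSE:", rmse_train)
```

Only the output weights `B` are estimated from data; `alpha` and `beta` stay at their random initial values, which is what removes the iterative tuning required by backpropagation.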
In an ELM model, a proper number of hidden neurons, randomized input layer weights ($\alpha$), and randomized hidden layer biases ($\beta$) can lead to a zero error, so that the weights of the output layer can be obtained analytically for any training set [34]:

$$\sum_{t=1}^{N} \lVert z_t - y_t \rVert = 0 \tag{3}$$

The system of linear equations can be used to obtain the value of $B$ for any input-output training samples:

$$Y = GB \tag{4}$$

in which

$$G(\alpha, \beta, x) = \begin{bmatrix} g(x_1) \\ \vdots \\ g(x_N) \end{bmatrix} = \begin{bmatrix} g_1(\alpha_1 \cdot x_1 + \beta_1) & \cdots & g_H(\alpha_H \cdot x_1 + \beta_H) \\ \vdots & \cdots & \vdots \\ g_1(\alpha_1 \cdot x_N + \beta_1) & \cdots & g_H(\alpha_H \cdot x_N + \beta_H) \end{bmatrix}_{N \times H} \tag{5}$$

and

$$B = \begin{bmatrix} B_1^{T} \\ \vdots \\ B_H^{T} \end{bmatrix}_{H \times 1} \tag{6}$$

and

$$Y = \begin{bmatrix} y_1^{T} \\ \vdots \\ y_N^{T} \end{bmatrix}_{N \times 1} \tag{7}$$

where $G$ is known as the hidden layer output, and $T$ is the transpose of the matrix. The output weights $\hat{B}$ can be estimated by inverting the matrix of the hidden layer using the Moore-Penrose generalized inverse function ($+$):

$$\hat{B} = G^{+} Y \tag{8}$$

Subsequently, the estimated values $\hat{y}$ can be determined by:

$$\hat{y} = \sum_{i=1}^{H} \hat{B}_i \, g_i(\alpha_i \cdot x_t + \beta_i) \tag{9}$$

2.2. Feedforward neural network (FFNN)

FFNN with backpropagation (FFNN-BP) is one of the most widely used ANN algorithms. It is a mathematical model aimed at handling a non-linear relation between input-output sets of data. Historically, ANNs are information-processing tools that were derived from, and work like, the biological nervous system of the brain, with a fundamental component known as a neuron (node) [41]. ANN has been shown to be a useful tool for solving complex functions in different fields of water and environmental engineering [15,4].

The FFNN-BP algorithm involves training the network with input data that are processed through the system and then passed to the output layer; the resulting error is propagated back through the system until the desired target is achieved. The fundamental principle of FFNN-BP is to minimize the error in order to learn the training data and subsequently estimate the actual target [42]. The FFNN-BP is composed of three layers: input, hidden, and output layers, as seen in Fig. 2. The number of neurons in the hidden layer affects the generalization ability and capacity of the neural network: too many neurons increase the computational burden, whereas too few cannot produce the desired prediction accuracy [29,43]. The linear activation function is used in the output layers, while the sigmoid activation is applied to the hidden and input layers. The activation function is a mathematical function which is introduced into each neuron for the conversion of a linear function into a non-linear function.

Fig. 2. A three-layered feedforward neural network with backpropagation algorithm.

2.3. Kernel principal component analysis (KPCA)

PCA is one of the common multivariate statistical techniques used for reducing the dimension of high-volume data. The dimensionality reduction is normally achieved by randomly identifying the linear correlation between the variables [31]. However, as mentioned above, standard PCA allows only linear dimensionality reduction, while KPCA has been demonstrated to be a more powerful algorithm for mapping a non-linear process in the data set. The major importance of the kernel algorithm is its ability to operate without any non-linear optimization, in contrast to other non-linear methods [37,38]. By applying this method, input variables are transformed and used as independent PC variables [30]. Kaiser–Meyer–Olkin (KMO) is among the most commonly used statistics employed to assess the suitability of data in any factor analysis (FA) [13]. The KMO coefficient can be classified as follows: Excellent ≥ 0.9, Very well = 0.8-0.89, Well = 0.7-0.79, Mediocre = 0.6-0.69, Poor = 0.5-0.59 and Unacceptable < 0.5. The KMO index is given in Eq. (10). More explanation of the PCA can be obtained in other studies [13,45,46,44,14]. In this paper, a brief description of constructing KPCA for dimensional reduction is provided.

$$\mathrm{KMO} = \frac{\sum \sum r_{ij}^{2}}{\sum \sum r_{ij}^{2} + \sum \sum a_{ij}^{2}} \tag{10}$$

where $r_{ij}$ is the correlation coefficient between variables $i$ and $j$, and $a_{ij}$ is the partial correlation coefficient between them.

Assume a non-linear transformation $\phi(x)$ from the original space to a feature space $F$, in which the projected new features have zero mean (Eq. 11) and the sample covariance matrix $C$ is given by Eq. (12):

$$\frac{1}{N} \sum_{i=1}^{N} \phi(X_i) = 0 \tag{11}$$

$$C = \frac{1}{N} \sum_{i=1}^{N} \phi(X_i) \, \phi(X_i)^{T} \tag{12}$$

If the kernel function is defined as:

$$k(X_i, X_j) = \phi(X_i)^{T} \phi(X_j) \tag{13}$$

the matrix notation can be employed:

$$K^{2} a_k = \lambda_k N K a_k \tag{14}$$

where

$$K_{ij} = k(X_i, X_j) \tag{15}$$

and $a_k$ is the $N$-dimensional column vector of $a_{ki}$:

$$a_k = [a_{k1}, a_{k2}, \ldots, a_{kN}]^{T} \tag{16}$$

$a_k$ can be solved by

$$K a_k = \lambda_k N a_k \tag{17}$$

and the resulting kernel principal components can be calculated using

$$y_k(X) = \phi(X)^{T} v_k = \sum_{i=1}^{N} a_{ki} \, k(X, X_i) \tag{18}$$

If the projected dataset $\{\phi(x_i)\}$ does not have zero mean, the Gram matrix $\tilde{K}$ can be used to substitute the kernel matrix $K$. The Gram matrix is given by:

$$\tilde{K} = K - 1_N K - K 1_N + 1_N K 1_N \tag{19}$$

where $1_N$ is the $N \times N$ matrix with all elements equal to $1/N$.

The power of kernel methods is that it is not necessary to compute $\phi(x_i)$ explicitly; the kernel matrix can be directly constructed from the training data set $\{x_i\}$ [47]. The standard steps of kernel PCA dimensionality reduction can be summarized as: (i) construct the kernel matrix $K$ from the training data set $\{x_i\}$ using Eq. (15); (ii) compute the Gram matrix $\tilde{K}$ using Eq. (19); (iii) use Eq. (14) to solve for the vectors $a_i$ (substitute $K$ with $\tilde{K}$); (iv) compute the kernel principal components $y_k(x)$ using Eq. (18). Two commonly used kernels are the polynomial kernel and the Gaussian kernel. The current work employs the Gaussian kernel function:

$$k(X, Y) = \exp\left(-\frac{\lVert X - Y \rVert^{2}}{2\sigma^{2}}\right) \tag{20}$$

2.4. Autoregressive (AR) model

AR is commonly used in time-series simulation because it represents a stochastic process built with a degree of randomness and uncertainty [48]. The AR model forecasts the future value of any variable based on its prior values. In particular, the AR model is the regression of values on their previous occurrences. Therefore, the AR model of order $p$ is defined as AR($p$) and expressed as:

$$X_t = \beta_1 X_{t-1} + \beta_2 X_{t-2} + \ldots + \beta_p X_{t-p} + \varepsilon_t \tag{21}$$

where $\varepsilon_t$ is white noise with $E(\varepsilon_t) = 0$ and $\mathrm{VAR}(\varepsilon_t) = \sigma_e^{2}$, and the parameters $\beta_1, \beta_2, \ldots, \beta_p$ are the AR coefficients [14].

2.5. Proposed model development

For any data-driven model, determination of proper input variables is of paramount importance. Similarly, in time-series modeling, identifying the appropriate time lags is an essential part of selecting the proper model input combinations. As such, the autocorrelation function (ACF) and partial ACF (PACF) are used. In a time-series, autocorrelation is considered as the correlation between the previous and forthcoming data points of the series [36,49]. The proposed development of the current study is illustrated in Fig. 3 for ELM (KPCA-ELM) and FFNN (KPCA-FFNN). From the figure, it can be seen that the historical recorded data are collected, pre-processed and normalized within the range of 0-1. For this purpose, three different data-driven algorithms (ELM, FFNN, and AR) coupled with the KPCA were employed for modeling the performance of the Tamburawa WTP in Kano, Nigeria. At first, the KPCA algorithm is used to perform the dimension reduction of the variables. Subsequently, the selected KPCA input variables are imposed into the ELM, FFNN and AR models to determine the performance of the WTP (see Eq. 2).

Fig. 3. Structure of the proposed model development.

2.5.1. Model validation

Generally, in AI models, the primary purpose is to fit the model to the given data based on the employed indicators with the goal of achieving reliable prediction on the unknown data set. Owing to overfitting problems, satisfactory training performance is not always in agreement with the testing performance. Even though the ELM model can handle the overfitting problems of the traditional FFNN, overfitting can still occur, especially when the data size is small. In the validation process, different types of validation approach can be applied, including k-fold cross-validation, holdout, and leave-one-out. The holdout method is also considered a simpler version of k-fold, where the data are usually divided randomly into two sets known as the training and testing sets [42,43].
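The workflow of Section 2.5 (normalization to 0-1, KPCA projection, then a random hold-out split) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the data are synthetic placeholders, the kernel width `sigma`, the number of retained components and the 75/25 split ratio are assumed values, and the helper names (`minmax_scale`, `gaussian_kernel`, `kpca_fit_transform`) are hypothetical. The projection follows steps (i) to (iv) of Section 2.3 with the Gaussian kernel of Eq. (20).

```python
import numpy as np

def minmax_scale(X):
    """Scale every column to the range 0-1."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.where(hi > lo, hi - lo, 1.0)

def gaussian_kernel(X, sigma):
    """Step (i): K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)), Eqs. (15), (20)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def kpca_fit_transform(X, n_components, sigma=1.0):
    """Return the leading kernel principal components of the rows of X."""
    N = X.shape[0]
    K = gaussian_kernel(X, sigma)
    one_n = np.full((N, N), 1.0 / N)
    # Step (ii): centered Gram matrix, Eq. (19).
    K_c = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # Step (iii): eigenvectors of the centered matrix, Eq. (17);
    # eigh returns eigenvalues (equal to N * lambda_k) in ascending order.
    eigvals, eigvecs = np.linalg.eigh(K_c)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Rescale so the feature-space eigenvectors have unit length.
    a = eigvecs / np.sqrt(np.maximum(eigvals, 1e-12))
    # Step (iv): components of the training points, Eq. (18).
    return K_c @ a

# Synthetic placeholder data: 40 records of 8 raw-water variables.
rng = np.random.default_rng(1)
X_raw = rng.uniform(low=1.0, high=50.0, size=(40, 8))

X_scaled = minmax_scale(X_raw)
scores = kpca_fit_transform(X_scaled, n_components=4, sigma=1.0)

# Random hold-out split (assumed 75 % training, 25 % testing).
perm = rng.permutation(len(scores))
n_train = int(0.75 * len(scores))
train_idx, test_idx = perm[:n_train], perm[n_train:]
print(scores[train_idx].shape, scores[test_idx].shape)
```

The rows of `scores` would then replace the raw inputs of the ELM, FFNN or AR model. Note that `phi(x)` is never formed explicitly; only the kernel matrix is needed.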
The k-fold cross-validation is a mechanism adopted in order to avoid further overfitting. The original training set is partitioned into k equal-sized subsets. Of these k subsets, one is retained and used for validation, while the remaining k-1 subsets are used for training. The cross-validation procedure is then repeated k times (the folds), with each of the k subsets utilized in turn as the validation data. The final performance of the trained model is the average of the k subsets' validation performances. Mostly, the k value is determined by sample availability, usually from 2-10. The major advantage of the k-fold cross-validation mechanism is that in every single round, the validation set and the training sets are independent [50,51]. This brings about a performance objective which creates a sound foundation for optimizing the model [52]. Apart from this, implementing cross-validation has the ability to improve the efficiency of data usage. Generally, in model configuration, the overall data set is classified into three independent sets: model calibration set, test set, and validation set. Sometimes, sample sizes can be small, and this can lead to a lack of or poor sample representation. Through the involvement of cross-validation, the validation set and calibration set are combined together as a whole. Therefore, the overall data can be classified into two sets. By the k-fold random dynamic division of training samples, the model is more objective and stable [52]. As stated above, the obtained data are divided into two samples (training = 75 % and testing = 25 %) considering the 4-fold cross-validation. It is noteworthy that other approaches for validating and partitioning the data could be used (see Fig. 4).

Fig. 4. Illustration of k-fold cross-validation.

2.5.2. Evaluation criteria

Different evaluation criteria can be used to determine the comparative accuracy of the predictive models; as such, a multi-criteria indicator for measuring the models' performance was used in the current study, namely Nash-Sutcliffe (NC) as a goodness-of-fit measure and two statistical errors, root mean squared error (RMSE) and mean absolute percentage error (MAPE).

$$NC = 1 - \frac{\sum_{i=1}^{N} \left[ Y_{obs_i} - Y_{com_i} \right]^{2}}{\sum_{i=1}^{N} \left[ Y_{obs_i} - \bar{Y}_{obs} \right]^{2}} \tag{22}$$

$$RMSE = \sqrt{\frac{\sum_{i=1}^{N} \left( Y_{obs_i} - Y_{com_i} \right)^{2}}{N}} \tag{23}$$

$$MAPE = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{Y_{obs_i} - Y_{com_i}}{Y_{obs_i}} \right| \tag{24}$$

where $N$, $Y_{obs_i}$, $\bar{Y}_{obs}$ and $Y_{com_i}$ are the number of data points, the observed data, the average value of the observed data and the computed values, respectively.

3. Case study and data description

The Tamburawa water treatment plant (TWTP) in Kano (Nigeria), like other conventional water treatment plants, has the capacity to produce 150 ML of potable water per day to cover the communities in Kano city and the surroundings (see Fig. 5a). The raw water from the source is pumped via a pump station and then enters a preliminary treatment unit where grits and some of the suspended solids are removed to avoid pump wear and pipe deterioration. Fig. 5b shows the schematic flow chart of the important operational process. The operational process contains rapid mix, coagulation/flocculation, sedimentation, filtration, disinfection and final treated water, which can be distributed to different users such as domestic, commercial and institutional [12]. The historical recorded data from TWTP contained raw and treated turbidity (Turbr and Turbt) (NTU), total dissolved solids (TDSr and TDSt) (mg/L), suspended solids (SSr and SSt) (mg/L), pH (pHr and pHt), hardness (Hardr and Hardt) (mg/L), conductivity (Condr and Condt) (mS/cm), chloride content (Clr and Clt) (mg/L) and iron content (Fer and Fet) (mg/L). Table 1 shows the descriptive statistical analysis used for studying the data. The concentration of the raw and treated pH, Turb, TDS and Hard (mg/L) at the exit before the discharge to the receiving body is shown in Fig. 5c.

Fig. 5. (a) Map of the Tamburawa WTP (b) operational process of the plant (c) concentration of the raw and treated (pH, Turb, TDS, and Hard).

Table 1
Descriptive statistics of the data.

Parameters        Mean (X̄)    Xmax         Xmin
pHr               7.6813      11.5000      6.5000
Turbr (NTU)       200.7667    1796.0000    51.0000
Condr (mS/cm)     116.5067    257.0000     53.0000
TDSr (mg/L)       57.4705     106.400      17.9000
Hardr (mg/L)      36.3028     53.8700      24.6100
Clr (mg/L)        13.4560     33.5600      8.8800
SSr (mg/L)        154.200     1248.0000    34.0000
Fer (mg/L)        3.8977      32.0000      1.0400
pHt               6.4370      8.8000       5.0000
Turbt (mg/L)      0.8181      3.4100       0.2000
TDSt (mg/L)       80.1022     181.1000     4.2000
Hardt (mg/L)      29.0826     44.8900      17.9600

X̄, Xmax and Xmin indicate the mean, maximum and minimum, respectively.

Table 3
Eigenvalue and percentage of data explained by each factor.

Number   Eigenvalue   Difference   Proportion   Cumulative value   Cumulative proportion
1        2.3721       0.1956       0.2965       2.3721             0.2965
2        2.1765       0.9684       0.2721       4.5487             0.5686
3        1.2082       0.2390       0.1510       5.7568             0.7196
4        0.9692       0.3511       0.1212       6.7261             0.8408
5        0.6181       0.1502       0.0773       7.3442             0.9180
6        0.4679       0.3044       0.0585       7.8120             0.9765
7        0.1635       0.1389       0.0204       7.9755             0.9969
8        0.0245       -            0.0031       8.0000             1.0000
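The columns of Table 3 are linked by simple arithmetic, which the short Python sketch below reproduces from the eigenvalue column alone. It is offered as a worked check rather than as part of the original analysis; the variable names and the 84 % and 90 % thresholds (taken from the discussion of Table 3 in Section 4) are the only choices made here.

```python
import numpy as np

# Eigenvalues copied from Table 3 (8 factors).
eigenvalues = np.array([2.3721, 2.1765, 1.2082, 0.9692,
                        0.6181, 0.4679, 0.1635, 0.0245])

difference = eigenvalues[:-1] - eigenvalues[1:]   # gap to the next eigenvalue
proportion = eigenvalues / eigenvalues.sum()      # share of total variance
cumulative = np.cumsum(proportion)                # cumulative proportion

# Number of eigenvalues equal to or greater than 1.00.
n_kaiser = int(np.sum(eigenvalues >= 1.0))

# Smallest number of factors whose cumulative proportion reaches a target.
n_84 = int(np.searchsorted(cumulative, 0.84) + 1)
n_90 = int(np.searchsorted(cumulative, 0.90) + 1)

for k, (ev, p, c) in enumerate(zip(eigenvalues, proportion, cumulative), 1):
    print(f"{k}  {ev:.4f}  {p:.4f}  {c:.4f}")
print(n_kaiser, n_84, n_90)
```

With the Table 3 values, three eigenvalues are at or above 1.00, the first four factors reach a cumulative proportion of about 0.84, and the first five exceed 0.90.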
4. Results and discussions

The main motivation of this study is to explore the new data-driven algorithm known as the emerging extreme learning machine (ELM) for public and environmental sustainability; in other words, for the determination of WTP performance. As stated in the introductory section, according to WHO and UNICEF, water is one of the most indispensable factors needed to sustain life. A similar statement was made by UNESCO, who claimed that WTPs are core components needed to attain sustainable development and are crucial for public and environmental health. The emerging approach ELM was evaluated and compared with other data-driven models (FFNN, AR), and the conclusion is established in this section.

Two different models were developed for modeling the performance of the Tamburawa WTP, as shown in Table 2. The first model (M1) was developed using all the input combinations, while the second model (M2) was constructed using kernel principal component analysis (KPCA) for appropriate input variable selection and dimensionality reduction. The new kernel principal components (KPCs) as M2 were considered as the new input variables of the proposed models (FFNN, ELM, and AR). It should be noted that identifying a maximum number of hidden neurons, iterations, transfer function, and best structure is crucial in designing a different kind of ANN in order to develop the best model [53].

Table 2
The developed model input combination.

Model name               Model   Input variables
FFNN, ELM, AR            M1      pHr + Turbr + TDS + Hardr + Clr + Condr + Fer + SS
KPCA-(FFNN, ELM, AR)     M2      Turb + SSr + Hardr + Clr

In KPCA, various methods were used for identifying the proper factors that affect the overall modeling accuracy. Therefore, this paper explored the method of choosing eigenvalues equal to or greater than 1.00, as shown in Table 3. Holland [54] reported that in any correlation matrix, eigenvalues are used to condense the variance, where the highest eigenvalues (1 and above) are traditionally considered for any analysis by eigenvector ranking. Fig. 6 shows the specific values and the percentage cumulative variance of each factor as a graph, which demonstrates 8 input variables with the corresponding 8 eigenvectors and eigenvalues. Similarly, Table 3 shows the value of each factor and its percentage of separation from the primary variable. It can be seen from the table that more than 84 % of the variance was explained by the first 4 factors. Likewise, the result indicated that up to 5 factors have a significant percentage contribution of more than 90 %, as demonstrated in Fig. 6.

Fig. 6. The eigenvalue and cumulative variance versus the number of factors.

As with other data-driven algorithms, the optimal parameters of AR were chosen using different trial and error procedures. The model was fitted to the training data to calibrate the model, and then the trained model was subsequently used to find the values of the testing data. In this research, the AR model was developed using a discrete domain with a structural order of [na, nb, nk] = [4 [44] 1] in the MATLAB 9.3 (R2017a) system identification toolbox. The obtained results of the FFNN, ELM and AR models integrated with the KPCA algorithm to determine the performance of the Tamburawa WTP are presented in Table 4. It can be observed from the overall comparison that all the model combinations except for the AR model demonstrated acceptable performance in the modeling horizon; this can be proved by considering the NC values that are greater than 80 % in both training and testing.

Table 4
Performance results for pHt, Turbt, TDSt and Hardt.

                               Training                    Testing
Parameter  Model type          NC      RMSE    MAPE        NC      RMSE    MAPE
pHt        M1-FFNN             0.8705  0.0009  0.4241      0.8431  0.0009  0.4872
pHt        M2-KPCA-FFNN        0.7958  0.0024  0.4711      0.7019  0.0025  0.4747
Turbt      M1-FFNN             0.9574  0.0004  1.2847      0.9527  0.0005  1.2996
Turbt      M2-KPCA-FFNN        0.9883  0.0002  1.5667      0.9874  0.0005  1.7985
TDSt       M1-FFNN             0.8016  0.0020  2.2180      0.8585  0.0032  2.2410
TDSt       M2-KPCA-FFNN        0.9252  0.0017  2.0694      0.9432  0.0020  2.7131
Hardt      M1-FFNN             0.9549  0.0003  0.2500      0.9147  0.0004  0.2650
Hardt      M2-KPCA-FFNN        0.9657  0.0002  0.2635      0.9116  0.0003  0.2978
pHt        M1-ELM              0.9020  0.0020  0.4100      0.9450  0.0030  0.4870
pHt        M2-KPCA-ELM         0.8100  0.0030  0.4300      0.8440  0.0030  0.4660
Turbt      M1-ELM              0.9660  0.0010  0.0240      0.9010  0.0010  0.0280
Turbt      M2-KPCA-ELM         0.9968  0.0000  0.0210      0.9920  0.0000  0.0250
TDSt       M1-ELM              0.8830  0.0030  3.1740      0.8890  0.0040  3.9860
TDSt       M2-KPCA-ELM         0.9460  0.0010  3.0100      0.9680  0.0010  3.3220
Hardt      M1-ELM              0.9960  0.0000  0.2010      0.9900  0.0000  0.2770
Hardt      M2-KPCA-ELM         0.9990  0.0000  0.1680      0.9970  0.0000  0.1740
pHt        M1-AR               0.8334  0.0080  0.3228      0.8027  0.0092  0.3593
pHt        M2-KPCA-AR          0.7210  0.0025  0.4193      0.6063  0.0028  0.4488
Turbt      M1-AR               0.5925  0.0013  0.8675      0.5427  0.0018  0.8231
Turbt      M2-KPCA-AR          0.6002  0.0012  2.5226      0.5991  0.0016  2.5462
TDSt       M1-AR               0.5173  0.0010  1.6282      0.5001  0.0018  1.6436
TDSt       M2-KPCA-AR          0.5991  0.0010  5.3044      0.6726  0.0010  5.3609
Hardt      M1-AR               0.5007  0.0095  0.2717      0.5999  0.0095  0.2889
Hardt      M2-KPCA-AR          0.6020  0.0013  0.2417      0.6726  0.0019  0.2644
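The three indices reported in Table 4 follow Eqs. (22) to (24), and the 75 %/25 % partition corresponds to 4-fold cross-validation. The Python sketch below shows one way these pieces could fit together. It is illustrative only: the data are synthetic, an ordinary least-squares model stands in for the ELM, FFNN and AR models, and the function names are hypothetical.

```python
import numpy as np

def nash_sutcliffe(obs, com):
    """NC of Eq. (22)."""
    return 1.0 - np.sum((obs - com) ** 2) / np.sum((obs - obs.mean()) ** 2)

def rmse(obs, com):
    """RMSE of Eq. (23)."""
    return np.sqrt(np.mean((obs - com) ** 2))

def mape(obs, com):
    """MAPE of Eq. (24); observed values must be non-zero."""
    return np.mean(np.abs((obs - com) / obs))

def k_fold_indices(n_samples, k, rng):
    """Shuffle the sample indices and split them into k folds."""
    return np.array_split(rng.permutation(n_samples), k)

# Synthetic placeholder data: 80 records, 4 inputs, strictly positive target.
rng = np.random.default_rng(0)
X = rng.uniform(size=(80, 4))
y = 5.0 + X @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.05 * rng.standard_normal(80)

folds = k_fold_indices(len(y), k=4, rng=rng)
fold_scores = []
for i, val_idx in enumerate(folds):
    # Train on the other k-1 folds (75 %), validate on the held-out fold (25 %).
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    A_train = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
    coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
    A_val = np.column_stack([np.ones(len(val_idx)), X[val_idx]])
    pred = A_val @ coef
    fold_scores.append((nash_sutcliffe(y[val_idx], pred),
                        rmse(y[val_idx], pred),
                        mape(y[val_idx], pred)))

# Final performance: average over the k validation folds.
mean_nc, mean_rmse, mean_mape = np.mean(fold_scores, axis=0)
print(mean_nc, mean_rmse, mean_mape)
```

An NC of 1 with RMSE and MAPE of 0 would indicate a perfect fit, so higher NC and lower error values correspond to the better rows of Table 4.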
It is worth mentioning that the average performance in terms of RMSE
and MAPE for all three models shows reliable and satisfactory accuracy.
This can be attributed to the cross-validation process carried out
prior to model calibration, which is of great importance in model
evaluation [12,44]. Table 4 shows that M1, with 8 input variables,
produced the highest prediction accuracy for all three models in terms
of NC, RMSE, and MAPE when modeling pHt. Among the models, M1-ELM, with
NC (0.9450), RMSE (0.0030) and MAPE (0.4870) in the testing phase, had
the highest accuracy in comparison to M1-FFNN and M1-AR. It is
interesting to note that AR, as a classical linear model, demonstrated
a satisfactory ability to predict pHt; this is due to the skill of AR
in handling autoregressive time-series processes such as those of the
Tamburawa WTP. The time-series fitting
between the observed and predicted pHt is shown in Fig. 7a. From the figure, it can be clearly seen that the overall performance of the M1-ELM combination is superior to both the FFNN and AR model combinations. Also, its predicted time-series trend shows better agreement with the observed values than the other two models. In terms of overall percentage accuracy, M1-ELM increased the predictive performance by up to 10 % and 14 % with regard to the M1-FFNN and M1-AR models, respectively. Fig. 8a shows the scatter plot of pHt for the best performing model in the testing phase.

Fig. 7. Trends of time-series plots for the best model in testing phase (a) pHt (b) Turbt (c) TDSt (d) Hardt.

The low prediction performance of pH could be due to the agricultural and industrial activities along the Kano river system. According to [55], the high pH value is due to infiltration of the Challawa river, which receives effluents from surrounding industries, into the Kano river. It could also be a result of anthropogenic activities taking place regularly. The low pH value is a result of agricultural runoff as well as irrigation activities (with varying pH conditions) draining into the river. Similarly, significant parts of the urban Kano population rely heavily on the water for their domestic activities. According to [56], the pH value of drinking water set by WHO ranges from 6.5–8.5. The pH of the raw water is slightly alkaline (above 8.0) due to various substances such as salts, nitrogen and phosphate, and dissolved solids, but after the
treatment processes, the addition of chlorine (chlorination after the addition of lime to correct the pH) at the distribution points lowers the treated pH of the water to slightly acidic (6.0–6.9) in order to kill the microorganisms in the pipes and pumps along the distribution line to the points of use, and this process reduces the acidity, making the water neutral (pH = 7.0–7.5). It is important to note that the low pH could also be attributed to the large quantity of chemicals used as fertilizer in the form of NPK (nitrogen: phosphorus: potassium) during irrigation activities. These issues can be reduced by using organic matter, which has a relatively neutral pH. Despite the probable changes of pH values as a boundary condition, it should be noted that the proposed methods handle the initial condition of the data, and the validation process was carried out as described in Section 2.5.1. To further increase the quantitative accuracy of pH prediction in the TWTP, an ensemble technique can be employed. The ensemble technique combines the outputs of multiple predictors in order to enhance the final performance. It has been used with promising success in several fields, including hydro-environmental engineering, data mining and statistics, to improve prediction skill. The main goal of this technique is to improve on the performance of a single model by combining the results of various individual models. As a result, the ensemble approach can increase the prediction performance relative to a single model [57–59]. Hence, this approach is also expected to improve the accuracy of pH prediction for the TWTP. However, it is worth noting that the ensemble approach normally requires considerable computational time, which suggests the use of kernel optimization functions.

Fig. 8. Scatter plots for the best model in testing phase (a) pHt (b) Turbt (c) TDSt (d) Hardt.

Turbidity mostly provides cover and food for pathogens and, if not effectively removed, can cause an outbreak of waterborne diseases [12]. Table 4 presents the performance results of modeling the treated turbidity for all three models. A direct comparison between the models indicates that almost all the non-linear model combinations attained appreciable performance with respect to NC, RMSE, and MAPE. It can be observed that the integration of KPCA with the four-input combination yielded the best performance in modeling Turbt. Furthermore, the results revealed that, for predicting Turbt in the Tamburawa WTP, KPCA-ELM, with NC (0.9920), RMSE (0.0000) and MAPE (0.0250) in the testing phase, proved its merit over KPCA-FFNN and KPCA-AR and therefore emerged as a reliable model. The KPCA-FFNN model can also serve the prediction purpose efficiently despite being outperformed by the KPCA-ELM model. Similarly, it is interesting to note that there is a small increase in the prediction performance of KPCA-ELM with regard to the KPCA-FFNN model and an approximately 40 % increase with regard to the KPCA-AR model. Figs. 7b and 8b show the time-series and scatter plots for the best model in the testing phase. The overall prediction results show that, with the dimensionally reduced number of input variables, the performance accuracy increased in the case of Turbt.

On the other hand, TDS, which is comprised of organic salt, is considered to be one of the major organic substances that contribute to the deterioration of water quality. The results obtained for this variable are presented in Table 4. The M2-KPCA-ELM model outperformed all other models with reasonable accuracy in both the training and testing steps. According to the results, M2-KPCA-ELM, with NC (0.9680), RMSE (0.0010) and MAPE (3.3320) in the testing phase for the 4-input combination, exhibited the best performance accuracy and therefore proved to be a reliable model for the prediction of TDSt for the Tamburawa WTP. A review of the TDSt results indicates that the KPCA-ELM model increased the prediction accuracy by up to 3 % and 30 % with regard to KPCA-FFNN and KPCA-AR, respectively. Furthermore, Table 4 clearly shows that, for all the models, TDSt had higher MAPE values in both training and testing than the other parameters in the WTP. This indicates a large accumulated percentage error. According to [12], the smaller the MAPE, the more accurate the prediction, with a range from 0 to 10 % being the best. Figs. 7c and 8c show the time-series and scatter plots for the best model in the testing phase. The overall prediction results show that, with a dimensionally reduced number of input variables, the
performance accuracy increased in the case of TDSt.

Fig. 9. Taylor diagram depicting the best performance of (a) pH (b) Turb (c) TDS and (d) Hard during the testing phase.

Lastly, the modeling results for Hardt in terms of NC, RMSE, and MAPE are presented in Table 4. It can be seen from the table that the KPCA-ELM model outperforms the other two models in terms of all the performance criteria. The results also show that M2, with the 4-input combination, served as the best model for assessing the performance of the Tamburawa WTP. M2-KPCA-ELM is superior to all other combinations, with NC = 0.9970, RMSE = 0.0000 and MAPE = 0.1740 in the testing phase. An additional comparison of the results indicated that, with regard to percentage variation and accuracy, the M2-KPCA-ELM model increased the accuracy by 8 % and 32 % in comparison to KPCA-FFNN and KPCA-AR, respectively. The forecasts for each model are presented in Figs. 7d and 8d in the form of a time-series plot and a scatter plot, respectively.

In order to capture the details of the three predictive data-driven algorithms, a Taylor diagram [47,48], a two-dimensional plot that shows how closely one or several models match the observed values, was constructed (Fig. 9). The Taylor diagram is also the most widely recommended diagram for accuracy comparison because it combines and quantifies multiple statistical performance metrics, comparing the similarity between the measured and predicted values in one diagram [52,34,16]. According to the graphical interpretation, the KPCA-ELM model was closer to the target measured values for all the variables in comparison with the other two models, with the slight exception of the modeling of pHt. The predictive results are also evidenced by the high correlation (R) attained by KPCA-ELM. Generally, if the standard deviation (SD) of the computed values is higher than the SD of the observed values, the model overestimates, and vice versa. It can be clearly observed that the emerging ELM demonstrated promise for the non-linear process, which is not surprising, as ELM has demonstrated excellent performance in modeling and prediction in recent decades in the field of hydro-environmental engineering [25].

It is indeed crucial to state that there is performance uniformity of the models in terms of prediction results; in other words, there is no exceptional model that exhibited superiority over the others. Generally, data-driven models behave differently in accordance with their learning processes [60]. As such, it is important to validate the outcomes of the current work against the established technical literature. For instance, some studies [3,12,21,61–63] reported significant performance of data-driven models in WTP analysis using various input variables and performance indicators. To summarize the discussion section, the proposed evolutionary ELM algorithm coupled with KPCA was found to have excellent prediction skill for modeling the performance of the Tamburawa WTP relative to the single data-driven intelligence models. The key advantage of the ELM model is its promising ability to overcome the disadvantages of traditional feedforward backpropagation [25]. On the other hand, PCA has been applied
successfully in the analysis of environmental engineering problems [31].

Even though ELM, as an emerging state-of-the-art model, has demonstrated excellent capability in predicting hydro-environmental and hydrological processes, the results were also compared with other learning approaches. For example, Zhang et al. [64] compared long short-term memory (LSTM) with SVR and FFNN for optimizing inter-catchment wastewater transfer. The outcomes showed that LSTM outperformed FFNN and SVR by 0.7 % and 0.17 %, respectively, in terms of the determination coefficient. Vijai and Sivakumar [65] compared DNN, ANN, ELM, LSSVM, Gaussian process regression (GPR), random forest (RF) and multiple regression for the estimation of water demand. The results demonstrated that ANN emerged as the best performing model in terms of the coefficient of determination (R2), RMSE, mean square error (MSE) and mean absolute error (MAE). It could be noted that techniques such as ELM and DNN did not perform as well. Unlike in ANN, in ELM the weights are learned in a single step with a single hidden layer and are not iteratively updated. Deep learning techniques are known for learning complex correlations between input and output variables. Inoue et al. [66] proposed the application of deep neural networks (DNN) and one-class support vector machines (SVM) for anomaly detection in water treatment systems. The results indicated the capability of both models, with DNN slightly more precise than the SVM model. The overall comparison confirmed that there is no unique or exceptional model that performs better than all others. The performance of one technique may surpass that of another, and when different sets of data are used, the results may be entirely opposite [35].

5. Conclusions

In this study, an emerging evolutionary data-intelligence model called extreme learning machine (ELM) coupled with kernel principal component analysis (KPCA) is proposed to determine the performance of the Tamburawa water treatment plant (WTP) located in Kano (Nigeria) in terms of treated pH, turbidity, total dissolved solids, and hardness. Feedforward neural network (FFNN) and autoregressive (AR) models were also employed for comparison. The historically recorded data obtained from the plant were used for the prediction with a k-fold cross-validation approach. Two different model combinations were built: the single models (FFNN, ELM, and AR) and their KPCA-integrated counterparts (KPCA-FFNN, KPCA-ELM, and KPCA-AR). The predictive results revealed the potential of KPCA-ELM, with a high level of accuracy over the comparable single models for all the considered variables, with a slight exception in terms of pH prediction. The results also showed that both the ELM and FFNN models demonstrated prediction skill and can, therefore, serve as reliable models. The results of this study may contribute to the modeling of the treated parameters and provide a reference benchmark for wastewater management and control in the Tamburawa WTP. The outcomes also suggest that other algorithms such as LSTM, RF, DNN, DBNN and ensemble learning may be combined with KPCA in order to develop a new model which could produce higher accuracy and more reliable estimates.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The authors wish to thank the staff of the Tamburawa Water Treatment Plant, Kano (Nigeria) for supplying the available data used to conduct this research.

References

[1] WHO and Unicef, Progress on sanitation and drinking water 2013 update, World Health 1 (October) (2013) 1–40.
[2] B. Maryam, H. Büyükgüngör, Wastewater reclamation and reuse trends in Turkey: opportunities and challenges, J. Water Process Eng 30 (September) (2019) pp. 0–1.
[3] F.N. Ogwueleka, T.C. Ogwueleka, Optimization of drinking water treatment processes using artificial neural network, Niger. J. Technol. 28 (1) (2009) 16–25.
[4] UNESCO, Water Supply, Sanitation and Health, (2015) [Online]. Available: http://www.unesco.org/new/en/natural-sciences/environment/water/wwap/facts-and-figures/water-supply-sanitation-and-health/ [Accessed: 20-Oct-2019].
[5] A. Estim, S. Saufie, S. Mustafa, Water quality remediation using aquaponics sub-systems as biological and mechanical filters in aquaculture, J. Water Process Eng. 30 (February) (2019) p. 100566.
[6] V. Gitis, N. Hankins, Water treatment chemicals: trends and challenges, J. Water Process Eng. 25 (February) (2018) 34–38.
[7] V. Nourani, G. Elkiran, S.I. Abba, Wastewater treatment plant performance analysis using artificial intelligence - an ensemble approach, Water Sci. Technol. 78 (10) (2018) 2064–2076.
[8] M.A. Al-Ghouti, M.A. Al-Kaabi, M.Y. Ashfaq, D.A. Da’na, Produced water characteristics, treatment and reuse: a review, J. Water Process Eng 28 (February) (2019) 222–239.
[9] S.I. Abba, G. Elkiran, Effluent prediction of chemical oxygen demand from the wastewater treatment plant using artificial neural network application, Procedia Comput. Sci. 120 (2017) 156–163.
[10] D. Hanbay, I. Turkoglu, Y. Demir, Prediction of wastewater treatment plant performance based on wavelet packet decomposition and neural networks, Expert Syst. Appl. 34 (2) (2008) 1038–1043.
[11] E. Belia, et al., Wastewater treatment modelling: dealing with uncertainties, Water Sci. Technol. 60 (8) (2009) 1929–1941.
[12] M.S. Gaya, et al., Estimation of turbidity in water treatment plant using hammerstein-wiener and neural network technique, Indones. J. Electr. Eng. Comput. Sci. 5 (3) (2017) 666–672.
[13] A. Solgi, A. Pourhaghi, R. Bahmani, H. Zarei, Improving SVR and ANFIS performance using wavelet transform and PCA algorithm for modeling and predicting biochemical oxygen demand (BOD), Ecohydrol. Hydrobiol. 17 (2) (2017) 164–175.
[14] S.J. Hadi, M. Tombul, Forecasting daily streamflow for basins with different physical characteristics through data-driven methods, Water Resour. Manag. 32 (10) (2018) 3405–3422.
[15] A. Danandeh Mehr, E. Kahya, E. Olyaie, Streamflow prediction using linear genetic programming in comparison with a neuro-wavelet technique, J. Hydrol. 505 (2013) 240–249.
[16] G. Elkiran, V. Nourani, S.I. Abba, Multi-step ahead modelling of river water quality parameters using ensemble artificial intelligence-based approach, J. Hydrol. 577 (October) (2019) p. 123962.
[17] V. Nourani, G. Elkiran, J. Abdullahi, A. Tahsin, Multi-region modeling of daily global solar radiation with artificial intelligence ensemble, Nat. Resour. Res. (2019).
[18] Z.M. Yaseen, S.O. Sulaiman, R.C. Deo, K.W. Chau, An enhanced extreme learning machine model for river flow forecasting: state-of-the-art, practical applications in water resource engineering area and future research direction, J. Hydrol. 569 (October) (2019) 387–408.
[19] R.S. Govindaraju, Artificial neural networks in hydrology. I: Preliminary concepts, J. Hydrol. Eng. 5 (2) (2000) 115–123.
[20] J.H. Al-baidhani, M.A. Alameedee, Prediction of water treatment plant outlet turbidity using artificial neural network, Int. J. Curr. Eng. Technol. 7 (4) (2017) 1559–1565.
[21] G. De Wu, S.L. Lo, Predicting real-time coagulant dosage in water treatment by artificial neural networks and adaptive network-based fuzzy inference system, Eng. Appl. Artif. Intell. 21 (8) (2008) 1189–1195.
[22] T. Yu, S. Yang, Y. Bai, X. Gao, C. Li, Inlet water quality forecasting of wastewater treatment based on kernel principal component analysis and an extreme learning machine, Water (Switzerland) 10 (7) (2018).
[23] S. Haghiri, A. Daghighi, S. Moharramzadeh, Optimum coagulant forecasting by modeling jar test experiments using ANNs, Drink. Water Eng. Sci. 11 (1) (2018) 1–8.
[24] L.S. Gomes, F.A.A. Souza, R.S.T. Pontes, T.R.F. Neto, R.A.M. Araújo, Coagulant dosage determination in a water treatment plant using dynamic neural network models, Int. J. Comput. Intell. Appl. 14 (3) (2015) 1–18.
[25] Z.M. Yaseen, M.F. Allawi, A.A. Yousif, O. Jaafar, F.M. Hamzah, A. El-Shafie, Non-tuned machine learning approach for hydrological time series forecasting, Neural Comput. Appl. 30 (5) (2018) 1479–1491.
[26] A.A. Nadiri, S. Shokri, F.T.C. Tsai, A. Asghari Moghaddam, Prediction of effluent quality parameters of a wastewater treatment plant using a supervised committee fuzzy logic model, J. Clean. Prod. 180 (February) (2018) 539–549.
[27] D.S. Manu, A.K. Thalla, Artificial intelligence models for predicting the performance of biological wastewater treatment plant in the removal of Kjeldahl Nitrogen from wastewater, Appl. Water Sci. 7 (7) (2017) 3783–3791.
[28] N. Bekkari, A. Zeddouri, Using artificial neural network for predicting and controlling the effluent chemical oxygen demand in wastewater treatment plant, Manag. Environ. Qual. An Int. J. 30 (3) (2019) 593–608.
[29] G. Elkiran, V. Nourani, S.I. Abba, J. Abdullahi, Artificial intelligence-based approaches for multi-station modelling of dissolve oxygen in river, Glob. J. Environ. Sci. Manag. 4 (4) (2018) 439–450.
[30] R. Noori, M.A. Abdoli, A. Ameri Ghasrodashti, M. Jalili Ghazizade, Prediction of
municipal solid waste generation with combination of support vector machine and principal component analysis: a case study of mashhad, Environ. Prog. Sustain. Energy 28 (2) (2009) 249–258.
[31] M. G. E. D. Z. S. J. R. B, Miodrag Belosevic, Decomposition analysis on influence factors of direct household, Environ. Sci. Technol. 33 (2) (2014) 482–489.
[32] C.H. Park, H. Park, A comparison of generalized linear discriminant analysis algorithms, Pattern Recognit. 41 (3) (2008) 1083–1097.
[33] X. Xin, et al., Insights into the toxicity of triclosan to green microalga Chlorococcum sp. using synchrotron-based fourier transform infrared spectromicroscopy: biophysiological analyses and roles of environmental factors, Environ. Sci. Technol. 52 (4) (2018) 2295–2306.
[34] G.-B. Huang, Q.-Y. Zhu, C.-K. Siew, Extreme learning machine: theory and applications, Neurocomputing 70 (1–3) (2006) 489–501.
[35] S.J. Hadi, S.I. Abba, S.S. Sammen, S.Q. Salih, N. Al-Ansari, Z.M. Yaseen, Non-linear input variable selection approach integrated with non-tuned data intelligence model for streamflow pattern simulation, IEEE Access 7 (2019) 141533–141548.
[36] Z.M. Yaseen, et al., Stream-flow forecasting using extreme learning machines: a case study in a semi-arid region in Iraq, J. Hydrol. 542 (2016) 603–614.
[37] Z.M. Yaseen, S.O. Sulaiman, R.C. Deo, K.W. Chau, An enhanced extreme learning machine model for river flow forecasting: state-of-the-art, practical applications in water resource engineering area and future research direction, J. Hydrol. 569 (August) (2019) 387–408.
[38] S. Zhu, S. Heddam, S. Wu, J. Dai, B. Jia, Extreme learning machine-based prediction of daily water temperature for rivers, Environ. Earth Sci. 78 (6) (2019) p. 0.
[39] V. Nourani, G. Andalib, F. Sadikoglu, Multi-station streamflow forecasting using wavelet denoising and artificial intelligence models, Procedia Comput. Sci. 120 (2017) 617–624.
[40] Z.M. Yaseen, S.O. Sulaiman, R.C. Deo, K.W. Chau, An enhanced extreme learning machine model for river flow forecasting: state-of-the-art, practical applications in water resource engineering area and future research direction, J. Hydrol. 569 (2019) 387–408.
[41] S.I. Abba, G. Elkiran, Effluent prediction of chemical oxygen demand from the wastewater treatment plant using artificial neural network application, Procedia Comput. Sci. 120 (2017) 156–163.
[42] M.S. Gaya, N. Abdul Wahab, Y.M. Sam, S.I. Samsudin, Anfis modelling of carbon and nitrogen removal in domestic wastewater treatment plant, J. Teknol. Sciences Eng. 67 (5) (2014) 29–34.
[43] S.I. Abba, S.J. Hadi, J. Abdullahi, River water modelling prediction using multi-linear regression, artificial neural network, and adaptive neuro-fuzzy inference system techniques, Procedia Comput. Sci. 120 (2017) 75–82.
[44] Y. Zhang, Enhanced statistical analysis of nonlinear processes using KPCA, KICA and SVM, Chem. Eng. Sci. 64 (5) (2009) 801–811.
[45] B. Schölkopf, A. Smola, K.R. Müller, Nonlinear component analysis as a kernel eigenvalue problem, Neural Comput. 10 (5) (1998) 1299–1319.
[46] K.P. Singh, A. Malik, D. Mohan, S. Sinha, Multivariate statistical techniques for the evaluation of spatial and temporal variations in water quality of Gomti River (India) - A case study, Water Res. 38 (18) (2004) 3980–3992.
[47] Q. Wang, Kernel Principal Component Analysis and Its Applications in Face Recognition and Active Shape Models, (2012).
[48] J. Adamowski, H. Fung Chan, S.O. Prasher, B. Ozga-Zielinski, A. Sliusarieva, Comparison of multiple linear and nonlinear regression, autoregressive integrated moving average, artificial neural network, and wavelet artificial neural network methods for urban water demand forecasting in Montreal, Canada, Water Resour. Res. 48 (1) (2012) 1–14.
[49] S.I. Abba, et al., Modelling of Uncertain system: a comparison study of linear and non-linear approaches, IEEE (2019) 1–6.
[50] S.J. Aboud, M. Al Fayoumi, M. Alnuaimi, Verification and validation of simulation models, Handb. Res. Discret. Event Simul. Environ. Technol. Appl. (2009) 58–74.
[51] N. Tsioptsias, A. Tako, S. Robinson, Model validation and testing in simulation: a literature review, OpenAccess Ser. Informatics 50 (6) (2016) 6.1–6.11.
[52] T. Zhou, F. Wang, Z. Yang, Comparative analysis of ANN and SVM models combined with wavelet preprocess for groundwater depth prediction, Water (Switzerland) 9 (10) (2017).
[53] E. Olyaie, H. Zare Abyaneh, A. Danandeh Mehr, A comparative analysis among computational intelligence techniques for dissolved oxygen prediction in Delaware River, Geosci. Front. 8 (3) (2017) 517–527.
[54] C. Holland, Principal components analysis, Encycl. Ecol. (2008) 2940–2949 Five-Volume Set, no. July.
[55] H. Ahmad, I. Indabawa, A study of algal species of Kano River, Tamburawa, Kano State, Nigeria, Bayero J. Pure Appl. Sci. 8 (1) (2015) 42.
[56] A. Zakari, V.A. Ikudayisi, S.I. Giwa, Quality assessment of the changes in the physico-chemical parameters in Pipe-Borne water supplied in Kano Metropolis, Nigeria, IOSR J. Appl. Chem. 7 (11) (2014) 74–81.
[57] Y. Khan, S.S. Chai, Ensemble of ANN and ANFIS for water quality prediction and analysis - a data driven approach, J. Telecommun. Electron. Comput. Eng. 9 (2–9) (2017) 117–122.
[58] S.E. Kim, I.W. Seo, Artificial Neural Network ensemble modeling with conjunctive data clustering for water quality prediction in rivers, J. Hydro-Environ. Res. 9 (3) (2015) 325–339.
[59] I. Partalas, G. Tsoumakas, E.V. Hatzikos, I. Vlahavas, Greedy regression ensemble selection: theory and an application to water quality prediction, Inf. Sci. (Ny) 178 (20) (2008) 3867–3879.
[60] O. Kisi, Z.M. Yaseen, The potential of hybrid evolutionary fuzzy intelligence model for suspended sediment concentration prediction, Catena 174 (May) (2019) 11–23.
[61] K.E. Taylor, Summarizing multiple aspects of model performance in a single diagram, J. Geophys. Res. Atmos. 106 (D7) (2001) 7183–7192.
[62] C.M. Kim, M. Parnichkun, Prediction of settled water turbidity and optimal coagulant dosage in drinking water treatment plant using a hybrid model of k-means clustering and adaptive neuro-fuzzy inference system, Appl. Water Sci. 7 (7) (2017) 3885–3902.
[63] A. Maleki, S. Nasseri, M.S. Aminabad, M. Hadi, Comparison of ARIMA and NNAR models for forecasting water treatment plant’s influent characteristics, KSCE J. Civ. Eng. 22 (9) (2018) 3233–3245.
[64] D. Zhang, E.S. Hølland, G. Lindholm, H. Ratnaweera, Hydraulic modeling and deep learning based flow forecasting for optimizing inter catchment wastewater transfer, J. Hydrol. 567 (November) (2018) 792–802.
[65] P. Vijai, P. Bagavathi Sivakumar, Performance comparison of techniques for water demand forecasting, Procedia Comput. Sci. 143 (2018) 258–266.
[66] J. Inoue, Y. Yamagata, Y. Chen, C.M. Poskitt, J. Sun, Anomaly detection for a water treatment system using unsupervised machine learning, IEEE Int. Conf. Data Min. Work. ICDMW 2017 (November) (2017) 1058–1065.
