
Fourth International Conference on Electronics, Communication and Aerospace Technology (ICECA-2020)

IEEE Xplore Part Number: CFP20J88-ART; ISBN: 978-1-7281-6387-1

MACHINE LEARNING BASED CUSTOMER CHURN PREDICTION IN BANKING

Manas Rahman
Department of Computer Science
Central University of Kerala
Periye, Kasaragod
manasrahmanpadiyoor@gmail.com

V Kumar
Department of Computer Science
Central University of Kerala
Periye, Kasaragod
vkumar@cukerala.ac.in

Abstract—The number of service providers is increasing very rapidly in every business. These days, there is no shortage of options for customers in the banking sector when choosing where to put their money. As a result, customer churn and engagement have become top issues for most banks. In this paper, a method that predicts customer churn in a bank using machine learning techniques, a branch of artificial intelligence, is proposed. The research explores the likelihood of churn by analyzing customer behavior. The KNN, SVM, Decision Tree, and Random Forest classifiers are used in this study. In addition, feature selection methods have been applied to find the most relevant features and to verify system performance. The experimentation was conducted on the churn modeling dataset from Kaggle. The results are compared to find an appropriate model with higher precision and predictability. The results show that the Random Forest model after oversampling outperforms the other models in terms of accuracy.

Index Terms—Customer churn in Bank, k-Nearest Neighbor, Support Vector Machine, Decision Tree, Random Forest.

I. INTRODUCTION

The market is very dynamic and highly competitive nowadays because of the availability of a large number of service providers. The challenge for service providers is keeping up with changing customer behavior and rising expectations. The aspirations of current-generation consumers and their diverse demands for connectivity and innovative, personalized approaches are very distinct from those of previous generations. They are well educated and better informed about emerging approaches. Such advanced knowledge has changed their purchasing behavior, resulting in a trend of 'analysis-paralysis': over-analyzing the selling and purchase scenario, which ultimately helps them to improve their purchase decisions. Therefore, it is a big challenge for new-generation service providers to think innovatively to fulfil and add value for their customers.

Corporations need to recognize their consumers. Liu and Shih [1] strengthen this argument by noting the increasing competitive pressure on organizations to develop innovative marketing approaches that meet consumer expectations and enhance loyalty and retention. Canning [2] argues that offering more to all is no longer a viable sales strategy, and that a market environment that continues to become more competitive needs an agenda that emphasizes the most effective use of marketing capital. Technology has been used to help businesses retain a competitive edge [3]. Data mining techniques [4] are a commonly used information technology for the extraction of marketing expertise and further guidance for business decisions.

It is very easy for customers to switch from one organization (bank) to another for better service quality or price rates. Organizations are convinced that recruiting new customers is far more expensive and hard than keeping existing clients [5]. But delivering reliable service on time and on budget to customers, while maintaining a good working partnership with them, is another significant challenge. They need to consider consumers and their needs to resolve these challenges; among these, one of their primary emphases will be on client churn. Customer churn takes place when clients or subscribers cease to engage with a company or service. For any organization, winning business from new clients means going through the sales pipeline, using its sales and marketing assets in the cycle. Customer retention, on the other hand, is usually more budget-effective, because the organization has already gained the confidence and loyalty of current customers. So, a system that can efficiently predict customer churn in the early stages is really important for any organization. This paper aims to build a framework that can predict client churn in the banking sector using machine learning techniques [6].

II. LITERATURE REVIEW

The analysis of client churn in banking is a really broad area. In one of these studies, [7] pursue commercial bank client churn prediction based on the SVM model. For this work, a Chinese commercial bank consumer dataset containing 50,000 customer records is selected. After preprocessing the records, there are eventually 46,406 valid data records. Two types of SVM model are selected: linear SVM and SVM with a radial basis kernel function. The predictive effect of the classification models was greatly improved by the under-sampling approach. Due to the lopsided character of the actual commercial bank client churn dataset, the SVM model cannot accurately predict churners, and even the general assessment parameters cannot capture the predictive power of the model. The findings show that the integration of the random sampling approach with the SVM model can substantially increase predictive capacity and help commercial banks to predict churners more precisely.


But this study used a 1:10 proportion of churners to non-churners; at a 1:1 ratio, the accuracy is at most 80.84%. This is the main drawback of the work.

In another study [8], a scientific study of the use of data mining for extracting information from repositories in the banking sector is presented. The findings show that customers who use more banking services (products) seem to be more loyal, so the bank can concentrate on those customers who use fewer than three products and sell them goods as per their needs. The database used consists of records on 1866 customers at the date of the study. The research is based on one method of churn prediction using a neural network within the software package Alyuda NeuroIntelligence, which divides the data into three sets: training, validation, and testing. Three forms of characteristics are described in the data analysis stage: the characteristics to reject, the characteristics to keep, and the target characteristics to be measured. The model picks several hidden layers in the network design process. After training the network, the CCR of validation is 93.96%. The study concluded that, because of the high proportion of retirees in the total number of customers (691/1886), the bank has very well-tailored programs for retirees, and the probability of competing is extremely small. The biggest downside of this work is that the neural network is relatively slow and tedious. Table I summarizes churn prediction in the banking system using 'Chinese commercial bank data' and 'data from a small Croatian bank', and also shows the drawbacks of the existing works; to overcome these drawbacks, this work proposes an ML-based customer churn prediction in banking using the 'churn modeling' dataset.

TABLE I
PREVIOUS WORKS ON BANK CHURN PREDICTION

Authors | Title of the work | Year | Methodology | Remarks
B. He, Y. Shi, Q. Wan, and X. Zhao | Prediction of customer attrition of commercial banks based on SVM model | 2014 | SVM classification | Chinese commercial bank data; lower accuracy at a 1:1 ratio of churners to non-churners (80.84% accuracy).
A. Bilal Zorić | Predicting customer churn in banking industry using neural networks | 2016 | Alyuda neural network | Small Croatian bank data; the neural network is relatively slow and tedious; 93.30% accuracy.

The study [9] proposed a churn analysis model that helps telecommunication operators to predict the customers most likely to churn. The system uses machine learning strategies on a big data platform, and the Area Under Curve (AUC) standard measure is used to assess the efficiency of the model. The dataset used for the study was provided by the Syriatel telecom company. The model worked with 4 methodologies: Decision Tree, Random Forest, Gradient Boosted Machine Tree (GBM), and Extreme Gradient Boosting (XGBOOST). The Hortonworks Data Platform (HDP) was selected as the big data platform. Spark engines were used in almost all of the product's phases, such as data analysis, feature engineering, training, and software testing. The algorithm hyper-parameters were optimized with the aid of K-fold cross-validation. Since the target class is unbalanced, the sample for learning is rebalanced by taking a sample of data to balance the two classes. The study began with oversampling by multiplying the churn class to match the other class. A random under-sampling approach was also used, which decreases the sample size of the broad class to match the second class. The training started on the Decision Tree algorithm, optimizing the hyper-parameters tree depth and maximum number of nodes. In both Random Forest and GBM, the best results show that the best number of trees was 200, and GBM got better results than DT and RF. The result showed that the best AUC value was 93.301% for XGBOOST on 180 trees. The models were also tested by applying a new dataset at various times; without any constructive marketing intervention, XGBOOST again provided the best results, with 89% AUC. The study hypothesized that the resulting decrease could be attributed to the non-stationary data phenomenon, so the model needs to be retrained per time period.

III. METHODOLOGY

This work aims to predict customer churn in a commercial bank as early as possible using efficient data mining methods. A diagrammatic representation of the proposed model is given in Fig 1.


Fig. 1. Activity diagram of the proposed system

A. Dataset description

The dataset used in this analysis was obtained from Kaggle to model churn. It includes information on 10000 bank clients, and the target parameter is a binary variable that represents whether the customer has left the bank or is still a customer. Of these, 7963 were positive-class (retained) samples and 2037 were negative-class (exited) samples. The target variable reflects the binary flag 1 when the client has closed the bank account, and 0 when the client is retained. The dataset contains 13 feature vectors (predictors) reported from customer data and transactions processed by the customer. The details of these features are given in Table II.

TABLE II
DATASET DESCRIPTION

Feature Name | Feature Description
Row number | Row numbers from 1 to 10000.
Customer Id | Unique IDs for bank customer identification.
Surname | Customer's last name.
Credit Score | Credit score of the customer.
Geography | The country from which the customer belongs.
Gender | Male or Female.
Age | Age of the customer.
Tenure | Number of years for which the customer has been with the bank.
Balance | Bank balance of the customer.
Num of Products | Number of bank products the customer is utilizing (savings account, mobile banking, internet banking, etc.).
Has Cr Card | Binary flag for whether the customer holds a credit card with the bank or not.
Is Active Member | Binary flag for whether the customer is an active member of the bank or not.
Estimated Salary | Estimated salary of the customer in dollars.
Exited | Binary flag: 1 if the customer closed the account with the bank, 0 if the customer is retained.
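To make the class imbalance concrete, the following minimal Python sketch (not part of the original paper; the pandas stack and the file name Churn_Modelling.csv are assumptions) loads the Kaggle data and checks the target distribution described above:

```python
import pandas as pd

# Load the Kaggle churn-modeling dataset (file name is an assumption).
df = pd.read_csv("Churn_Modelling.csv")

print(df.shape)                      # expected: (10000, 14) -- 13 predictors + target
print(df["Exited"].value_counts())  # expected: 7963 retained (0) vs 2037 exited (1)
```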

B. Data Preprocessing

Preprocessing the data is a significant phase in the data mining process, since it has a direct effect on the success rate of the task. It must deal with irrelevance, noisiness, and unreliability of the data and, if necessary, perform data conversion. The predictor descriptions after preprocessing are listed in Table III; these are the attributes taken for deciding on churn prediction in this study.

1) Irrelevancy: Data or features which have no impact on the topic of discussion shall be considered irrelevant. Keeping such attributes may sometimes affect the performance of classifiers. In the churn dataset, the features named Row number, Customer Id, Surname, and Geography have nothing to do with the prediction, so these features have been removed manually in this study.

2) Transformation: Data transformation is the practice of turning data from one form into another. Properly structured and validated data enhance data quality and protect applications against possible minefields such as null values, unwanted duplicates, incorrect indexing, and incompatible formats. In this work the following data transformation is carried out:

• Gender: Female -> 0 and Male -> 1
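A minimal sketch of the two preprocessing steps above, assuming pandas and the raw Kaggle column names (the paper does not give its implementation):

```python
import pandas as pd

df = pd.read_csv("Churn_Modelling.csv")  # file and column names are assumptions

# 1) Irrelevancy: drop the features judged irrelevant in this study.
df = df.drop(columns=["RowNumber", "CustomerId", "Surname", "Geography"])

# 2) Transformation: encode Gender as Female -> 0 and Male -> 1.
df["Gender"] = df["Gender"].map({"Female": 0, "Male": 1})

X = df.drop(columns=["Exited"])  # the nine remaining predictors (Table III)
y = df["Exited"]                 # target: 1 = churned, 0 = retained
```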


TABLE III
PREPROCESSED DATASET DESCRIPTION

Feature Name | Min | Max | Mean | SD
Credit Score | 350 | 850 | 650.5288 | 96.6533
Gender | 0 | 1 | 0.5457 | 0.4979
Age | 18 | 92 | 38.9218 | 10.4878
Tenure | 0 | 10 | 5.0128 | 2.8922
Balance | 0 | 250898.09 | 76485.8893 | 62397.4052
Num Of Products | 1 | 4 | 1.5302 | 0.5817
Has Cr Card | 0 | 1 | 0.7055 | 0.4558
Is Active Member | 0 | 1 | 0.5151 | 0.4998
Estimated Salary | 11.58 | 199992.48 | 100090.2399 | 57510.4928

C. Feature Selection

In machine learning, the process of identifying a subset of appropriate predictors to be used in model building is called feature selection. The feature selection phase is critical, as it helps to shorten the training time, escape the curse of high dimensionality, and, above all, simplify the model.

1) mRMR: Minimum Redundancy Maximum Relevance (mRMR) is a filter-type feature selection method. For classification problems, it rates features sequentially using the mRMR algorithm [10]. A filter-type feature selection algorithm evaluates feature significance based on feature characteristics, such as the variance of a feature and its relevance to the response. Feature selection is performed as part of the data preprocessing phase; hence, filter-type feature selection is uncorrelated with the training algorithm.

2) Relief: Relief is also a filter-type feature selection method, which ranks features using the Relief algorithm [11]. This algorithm works best for estimating the significance of features for distance-based supervised models that use pairwise distances between observations to predict the response. It ranks the predictors by importance using a specified number of nearest neighbors, and the result is the list of predictor indices ordered by rank.
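mRMR and Relief implementations are not named in the paper. As a stand-in illustrating the same filter-type idea (scoring each feature against the target independently of any downstream classifier), a univariate mutual-information filter from scikit-learn could look like this; note that this is not the authors' exact ranking method:

```python
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

# X, y as produced by the preprocessing sketch above.
scores = mutual_info_classif(X, y, random_state=0)
ranking = pd.Series(scores, index=X.columns).sort_values(ascending=False)

print(ranking)                  # features ordered by estimated relevance
top6 = list(ranking.index[:6])  # keep the 6 best, mirroring the paper's mRMR setup
```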
D. Over Sampling

In data processing, oversampling and undersampling are strategies used to adjust the class distribution of a dataset. Since the data is highly imbalanced (7963 positive-class samples and 2037 negative-class samples) and the size of the available data sample is small, this study makes use of the oversampling technique. If undersampling were preferred, the size of the data would decrease to the point that there would not be enough data to build the model. Hence, this study uses random oversampling by resampling the minority (negative) class.
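The paper does not spell out its oversampling code; a minimal sketch of random oversampling of the minority (exited) class using sklearn.utils.resample might look like:

```python
import pandas as pd
from sklearn.utils import resample

majority = df[df["Exited"] == 0]   # retained clients (7963 samples)
minority = df[df["Exited"] == 1]   # exited clients (2037 samples)

# Randomly duplicate minority samples (with replacement) until the
# two classes are the same size.
minority_up = resample(minority, replace=True,
                       n_samples=len(majority), random_state=42)
df_balanced = pd.concat([majority, minority_up])

print(df_balanced["Exited"].value_counts())  # both classes now equally sized
```

In practice, the resampling would normally be applied only to the training portion of the data, so that duplicated minority samples do not leak into the test set.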
E. Classification

The classification methods were applied to the preprocessed data. KNN, SVM, Decision Tree (DT), and RF classifiers are used for comparison of results. The results of the different classifiers were also compared over the features chosen by the different feature selection methods.

1) k-Nearest Neighbor (KNN): The KNN method is one of the simplest and most efficient non-parametric classification methods, based on supervised learning [12]. KNN works by identifying the k nearest samples in an existing dataset: when a new unknown sample appears, it classifies the new sample into the most similar class. That is, the classification algorithm determines the test sample's group from the k training samples that are its nearest neighbors, and assigns it to the class with the highest likelihood.

2) Support Vector Machine (SVM): The Support Vector Machine is an efficient, supervised machine learning algorithm derived from Vapnik's theory of statistical learning [13], [14], [15]. It has proved its success in the fields of classification [16], regression [17], time series prediction, and estimation in geotechnical practice and mining science [18]. SVM's main objective is to find an efficient separating hyperplane that precisely categorizes data points and distinguishes the points of the two classes as far as possible, reducing the possibility of misclassifying the training samples and unknown test samples. This implies that there is maximum distance between the two classes and the separating hyperplane. The Linear Support Vector Machine (LSVM) model is used in this work; LSVM was originally developed to deal with binary class problems [19].

3) Decision Tree (DT): A decision tree is a procedure that slices a collection of data into various branch-like segments [20]. A decision tree is easy to read, which makes the explanations for the model simple. While another algorithm (like a neural network) can generate a much more accurate model in a given scenario, a decision tree could be trained to predict the neural network's predictions, thus opening up the neural network's "black box". Another benefit is that it can model a high degree of nonlinearity in the correlation between the target variables and the predictor variables. A decision tree is composed of two major strategies [21]: tree creation and classification.

4) Random Forest (RF): Breiman [22] presented RF as an ensemble classifier of tree learners. The method employs several decision trees, where each tree relies on the values of an independently selected random vector with the same distribution for all trees. It is the right choice to counter the tendency of decision trees to overfit their training set. In short, random forests are a way to combine many deep decision trees, learned on various sections of the same dataset, with the goal of decreasing the variance. The real advantage of using RF is that it copes with quite high-dimensional data, with no need to perform dimensionality reduction or feature selection. The training rate is also higher, and it is easy to use in parallel models.

IV. RESULTS AND DISCUSSIONS

When the preprocessing of the data has been completed, the data is in operational form, and the nine features obtained after preprocessing (Table III) are taken for the remaining study. Of this, 70% of the data is used for training and the remaining 30% for random testing. The classifiers are used alone and along with the specified feature selection methods, and each model is evaluated by the accuracy obtained after a 10-fold cross-validation. The confusion matrix was also produced for each model. The performance of the classifiers varies when using different feature selection methods. The features selected by each feature selection method and the classifier parameter details are described in the following paragraphs.

For KNN, the k-value is set to 5; that is, the nearest five neighbors are considered for classifying new data. By reducing the number of neighbors below 5, the accuracy sometimes increases, and vice versa; but, since the data is taken randomly for classification, it is not good practice to select fewer neighbors, and when the number of neighbors is greater than 5, the result decreases sharply. Hence, the value of k is selected as 5, at which the accuracy and its variation


are optimized. The distance measure used is Euclidean distance. For SVM, the linear kernel function is used (LSVM). In the case of RF, the number of trees in the forest is set to 100. All these parameters are selected based on the optimization of classification accuracy.
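Putting the stated setup together (a 70/30 random split, the four classifiers with the parameters above, and accuracy from 10-fold cross-validation), a minimal sketch is given below; the scikit-learn class names are an assumption, since the paper does not name its tooling:

```python
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

# 70% of the data for training, 30% held out for random testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),       # k = 5, Euclidean distance
    "SVM": SVC(kernel="linear"),                      # linear-kernel SVM (LSVM)
    "DT":  DecisionTreeClassifier(random_state=0),
    "RF":  RandomForestClassifier(n_estimators=100,   # 100 trees in the forest
                                  random_state=0),
}

for name, model in models.items():
    # Each model is scored by accuracy from a 10-fold cross-validation.
    acc = cross_val_score(model, X_train, y_train, cv=10, scoring="accuracy")
    print(f"{name}: mean CV accuracy = {acc.mean():.4f}")
```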
The results of the various classification techniques with and without oversampling (without feature selection) are given in Table IV. It shows that the DT and RF classifiers' accuracy increased after oversampling, while there is no real change in KNN accuracy and the SVM accuracy is reduced by oversampling; this indicates that SVM is not suitable for huge amounts of data.

TABLE IV
RESULTS BY APPLYING CLASSIFIERS DIRECTLY

Classifier | Accuracy (%) | Accuracy after oversampling (%)
KNN | 81.65 | 81.37
SVM | 79.63 | 70.36
DT | 78.99 | 91.98
RF | 85.18 | 95.74

The best 6 features selected by the mRMR method are Num of Products, Is Active Member, Gender, Age, Balance, and Tenure. The accuracies of the various classifiers using the mRMR selection method are shown in Table V: the accuracy of KNN increased compared to KNN without mRMR, the SVM accuracy is almost identical to SVM without mRMR, and the DT and RF accuracies decreased a little compared to the previous models.

TABLE V
RESULTS AFTER MRMR FEATURE SELECTION

Classifier | Accuracy (%) | Accuracy after oversampling (%)
KNN | 83.97 | 82.57
SVM | 79.63 | 69.96
DT | 78.32 | 91.73
RF | 83.66 | 92.95

The best features selected by the Relief method are Num of Products, Age, Balance, Tenure, Gender, and Has Cr Card. The accuracies of the various classifiers are shown in Table VI: the KNN accuracy increased compared to KNN without feature selection and SVM remains the same, but the DT and RF accuracies decreased a little compared to the previous models.

TABLE VI
RESULTS AFTER RELIEF FEATURE SELECTION

Classifier | Accuracy (%) | Accuracy after oversampling (%)
KNN | 82.15 | 80.99
SVM | 79.63 | 69.53
DT | 77.61 | 90.74
RF | 81.75 | 92.19

By resampling the negative-class samples using oversampling (making the negative-class sample count the same as the positive class), the data imbalance problem is solved. Another finding is that resampling decreases the SVM score: by resampling, the actual data size increases, and the SVM is then unable to perform the required classification. KNN maintains almost the same accuracy after resampling. But the tree classifiers, DT and RF, increase in accuracy; this is because tree classifiers improve in accuracy when the amount of data is larger and balanced.

When applying the feature selection methods, the score of KNN increases a little, and the SVM accuracy remains almost the same after feature selection as well. But in DT and RF, the scores decrease a little. This is because the tree classifiers handle each feature more reliably; hence, the reduction in features affects this reliability and reduces the accuracy. In short, RF after oversampling gives higher accuracy than KNN, SVM, and DT in this study, and feature selection does not benefit the tree classifiers. Even though the results are not improved after feature selection, feature ranking is done. Among the features under consideration, "Num Of Products" is the feature with the highest significance in this study. As a conclusion, people with a larger number of bank products, like mobile banking, internet banking, savings accounts, fixed deposits, etc., are less likely to churn. Hence, the bank needs to focus on the people who are using fewer products.

V. CONCLUSION

When the banking sector is considered, like any other organization, customer engagement has become one of the primary concerns. To resolve this crisis, banks need to identify customer churn possibilities as quickly as possible. There are various ongoing studies in banking churn prediction; different entities measure the churn rate of customers in various ways using different bits of data or information. The need for a system that can forecast client churn in banking in a generalized way in the early stages is really important. The system needs to work with fixed and potential data sources that are independent of any service provider, and the model must be in a form in which it can use minimal information and give maximum throughput for the prediction. This study aims to fulfil these needs.

The purpose of this study is to build the most appropriate model to predict client churn in a bank in the early stages. The study only used a small amount of data (10000 samples), which was also highly imbalanced, whereas real commercial bank data would be much larger. By oversampling, both of these headaches can be resolved to a certain degree. The study examined the KNN, SVM, Decision Tree, and RF classifiers under different conditions. The best result is achieved when using the RF classifier together with oversampling (95.74%). Feature selection methods do not help the tree classifiers (Decision Tree and RF); as the results indicate, feature reduction (feature selection) decreases the prediction score of the tree classifiers. Another observation is that, unlike the other classifiers, oversampling decreases the SVM score. This is because the bank dataset is lopsided; hence, SVM is unable to handle the data well enough.


REFERENCES

[1] D.-R. Liu and Y.-Y. Shih, "Integrating AHP and data mining for product recommendation based on customer lifetime value," Information & Management, vol. 42, no. 3, pp. 387–400, 2005.
[2] G. Canning Jr., "Do a value analysis of your customer base," Industrial Marketing Management, vol. 11, no. 2, pp. 89–93, 1982.
[3] R. W. Stone and D. J. Good, "The assimilation of computer-aided marketing activities," Information & Management, vol. 38, no. 7, pp. 437–447, 2001.
[4] M.-S. Chen, J. Han, and P. S. Yu, "Data mining: an overview from a database perspective," IEEE Transactions on Knowledge and Data Engineering, vol. 8, no. 6, pp. 866–883, 1996.
[5] M.-K. Kim, M.-C. Park, and D.-H. Jeong, "The effects of customer satisfaction and switching barrier on customer loyalty in Korean mobile telecommunication services," Telecommunications Policy, vol. 28, no. 2, pp. 145–159, 2004.
[6] I. T. J. Swamidason, "Survey of data mining algorithms for intelligent computing system," Journal of Trends in Computer Science and Smart Technology, vol. 01, pp. 14–23, 09 2019.
[7] B. He, Y. Shi, Q. Wan, and X. Zhao, "Prediction of customer attrition of commercial banks based on SVM model," Procedia Computer Science, vol. 31, pp. 423–430, 2014.
[8] A. Bilal Zorić, "Predicting customer churn in banking industry using neural networks," Interdisciplinary Description of Complex Systems: INDECS, vol. 14, no. 2, pp. 116–124, 2016.
[9] A. K. Ahmad, A. Jafar, and K. Aljoumaa, "Customer churn prediction in telecom using machine learning in big data platform," Journal of Big Data, vol. 6, no. 1, p. 28, 2019.
[10] Y. Jiang and C. Li, "mRMR-based feature selection for classification of cotton foreign matter using hyperspectral imaging," Computers and Electronics in Agriculture, vol. 119, pp. 191–200, 2015.
[11] L. Beretta and A. Santaniello, "Implementing ReliefF filters to extract meaningful features from genetic lifetime datasets," Journal of Biomedical Informatics, vol. 44, no. 2, pp. 361–369, 2011.
[12] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification. John Wiley & Sons, 2012.
[13] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
[14] V. Vapnik, The Nature of Statistical Learning Theory. Springer Science & Business Media, 2013.
[15] V. N. Vapnik, "An overview of statistical learning theory," IEEE Transactions on Neural Networks, vol. 10, no. 5, pp. 988–999, 1999.
[16] J. Raj and V. Ananthi, "Recurrent neural networks and nonlinear prediction in support vector machines," Journal of Soft Computing Paradigm, vol. 2019, pp. 33–40, 09 2019.
[17] P. G. Nieto, E. F. Combarro, J. del Coz Díaz, and E. Montañés, "A SVM-based regression model to study the air quality at local scale in Oviedo urban area (northern Spain): A case study," Applied Mathematics and Computation, vol. 219, no. 17, pp. 8923–8937, 2013.
[18] S.-G. Cao, Y.-B. Liu, and Y.-P. Wang, "A forecasting and forewarning model for methane hazard in working face of coal mine based on LS-SVM," Journal of China University of Mining and Technology, vol. 18, no. 2, pp. 172–176, 2008.
[19] Y. Tang, "Deep learning using linear support vector machines," arXiv preprint arXiv:1306.0239, 2013.
[20] L. Breiman, J. Friedman, C. J. Stone, and R. A. Olshen, Classification and Regression Trees. CRC Press, 1984.
[21] N. B. Amor, S. Benferhat, and Z. Elouedi, "Qualitative classification with possibilistic decision trees," in Modern Information Processing. Elsevier, 2006, pp. 159–169.
[22] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.
