Available online at www.sciencedirect.com

ScienceDirect
Procedia Computer Science 72 (2015) 86 – 93

The Third Information Systems International Conference

Logistic Regression Ensemble for Predicting Customer Defection with Very Large Sample Size

Heri Kuswanto^a,*, Ayu Asfihani^b, Yogi Sarumaha^b, Hayato Ohwada^c

^a,b Department of Statistics, Institut Teknologi Sepuluh Nopember, Kampus ITS Sukolilo, Surabaya, 60111, Indonesia
^c Department of Industrial Administration, Graduate School of Science and Technology, Tokyo University of Science, Noda-Chiba, Japan

Abstract

Predicting customer defection is an important subject for companies producing cloud-based software. The company studied here sells three products (High, Medium and Low Price), and the consumer can choose to defect from or retain the product after a certain period of time. Because the company has collected a very large dataset, standard statistical models become inapplicable due to the curse of dimensionality. Parametric statistical models tend to produce very large standard errors, which may lead to inaccurate predictions. This research examines a machine learning approach developed for high dimensional data, namely the logistic regression ensemble (LORENS). Using a computational approach, LORENS predicts as well as the standard logistic regression model, with a prediction accuracy between 66% and 77%. In this case, LORENS is preferable because it is more reliable and free of assumptions.

© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Peer-review under responsibility of organizing committee of Information Systems International Conference (ISICO2015)
1877-0509. doi:10.1016/j.procs.2015.12.108

Keywords: ensemble; logistic regression; classification; high dimensional data; machine learning

1. Introduction

The cloud software market is growing by about 36 percent per year, and this growth is predicted to continue until 2016 (Columbus [1]). Furthermore, the use of the internet to collect fast, real-time feedback from customers has produced big data, which complicates the prediction of customer behaviour. This situation arises in most cloud-based software companies, including company “X”, which produces three kinds of antivirus products. Customers are considered to have defected when they stop using any of the products, as shown by termination of the contract.

Parametric statistical approaches involve statistical tests and inference, which usually require strict assumptions. These approaches commonly fail when applied to high dimensional data (or even to data with a very large sample size) because of the sensitivity of the P-value. Lin et al. [2] showed that a statistical test applied to data with a very large sample size tends to reject the null hypothesis, because the P-value becomes extremely small. Meanwhile, computational approaches that are free of such assumptions have been developed rapidly to analyze big data. Lim [3] introduced LORENS for classification problems. LORENS builds on the Classification by Ensembles from Random Partitions (CERP) algorithm, which divides the variables into several subspaces. The method combines the logistic regression (LR) models fitted on each partition into a single probability value used for classification. Lim et al. [4] as well as Lee et al. [5] argued that LORENS is able to produce good classification accuracy. Its strength in classification comes from the informative and representative character of logistic regression, together with the CERP property that the predictor variables are split into mutually exclusive subspaces.
Another challenge posed by the company “X” data is the markedly imbalanced proportion of defecting and non-defecting (retained) responses. King and Zeng [6] showed that this kind of imbalanced response may bias the estimated parameters of Binary Logistic Regression, especially when the parameters are estimated by maximum likelihood. In this case, the Hessian matrix used in the estimation will be small and the estimated parameters are biased. With an imbalanced response, using 0.5 as the standard threshold for assigning the predicted class is unfair to the probability of each class. LORENS overcomes this problem by proposing an optimum threshold that depends on the characteristics of the data.

This paper applies LORENS to predict customer defection for cloud-based software. Prasasti et al. [7] applied C4.5 and Support Vector Machine (SVM) to the same dataset used in this paper. Moreover, Prasasti and Ohwada [8] used J48, MLP and SMO and obtained satisfactory classification results. Those methods are popular machine learning approaches that are not designed specifically for high dimensional data, whereas LORENS was originally developed for classifying high dimensional data. The dataset in this paper does not strictly fit the definition of high dimensional data, because the number of variables is not greater than the sample size. However, it is worthwhile to assess the performance and applicability of LORENS for classifying data with a very large sample size.

2. Literature Review

This section briefly describes the two methods applied in this paper, i.e. Binary Logistic Regression (BLR) and the Logistic Regression Ensemble (LORENS).

2.1. Binary Logistic Regression (BLR)

Binary Logistic Regression is a method of data analysis used to find the relationship between a response variable (y) that is binary or dichotomous and predictor variables (x) that are polytomous or continuous (see Hosmer and Lemeshow [9] for details). The parameters of logistic regression are estimated using maximum likelihood. Suppose that $x_i$ and $y_i$ are the independent and dependent variables of the i-th observation, and that the observations are independent of one another. The probability function for each pair can then be expressed as follows.

$$f(x_i) = \pi(x_i)^{y_i} \left(1 - \pi(x_i)\right)^{1 - y_i}; \quad y_i = 0 \text{ or } 1 \tag{1}$$

with

$$\pi(x_i) = \frac{e^{\sum_{j=0}^{p} \beta_j x_{ij}}}{1 + e^{\sum_{j=0}^{p} \beta_j x_{ij}}} \tag{2}$$

where, for $j = 0$, the value $x_{ij} = x_{i0} = 1$.



Each pair of observations is assumed to be independent, so the likelihood function is the product of the probability functions of all pairs:

$$l(\beta) = \prod_{i=1}^{n} f(x_i) = \prod_{i=1}^{n} \pi(x_i)^{y_i} \left(1 - \pi(x_i)\right)^{1 - y_i} \tag{3}$$

This likelihood function is easier to maximize in the form $L(\beta) = \log l(\beta)$, and the parameter $\beta$ can be estimated with the Newton-Raphson method, starting from the first derivative of $L(\beta)$. The significance of each parameter can be tested with the Wald test under the following hypotheses:

$$H_0: \beta_i = 0$$
$$H_1: \beta_i \neq 0; \quad i = 1, 2, \ldots, p$$

The test statistic is defined as

$$W = \frac{\hat{\beta}_i}{SE(\hat{\beta}_i)}$$

The test statistic W (Wald statistic) follows the normal distribution, and $H_0$ is rejected if $|W| > Z_{\alpha/2}$.
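The estimation and test described above can be sketched in Python for the simplest case of one predictor plus an intercept. This is a minimal illustration and not the implementation used in the paper: the function names and the toy data are hypothetical, and the 2x2 information matrix is inverted in closed form so that no external library is needed.

```python
import math

def score_and_information(b0, b1, x, y):
    """Gradient of L(beta) and the information matrix (negative Hessian)."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))  # pi(x_i) as in Eq. (2)
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return (g0, g1), (h00, h01, h11)

def fit_logistic_newton(x, y, iterations=25):
    """Newton-Raphson estimates of (b0, b1) and the Wald statistic for b1."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        (g0, g1), (h00, h01, h11) = score_and_information(b0, b1, x, y)
        det = h00 * h11 - h01 * h01
        # beta <- beta + I^{-1} * gradient, using the closed-form 2x2 inverse
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard error of b1 from the inverse information at the estimate
    _, (h00, h01, h11) = score_and_information(b0, b1, x, y)
    se_b1 = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    return b0, b1, b1 / se_b1

# Hypothetical toy data: the share of y = 1 rises with x (1/4, 2/4, 3/4).
x = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
y = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1]
b0, b1, wald = fit_logistic_newton(x, y)
print(f"b0={b0:.4f} b1={b1:.4f} W={wald:.4f}")
```

For this toy sample the Wald statistic is about 1.35, below the 5 percent critical value of 1.96, so $H_0$ would not be rejected.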
Another important quantity in logistic regression is the odds ratio. Details about the interpretation of the odds ratio can be found in Agresti [10]. For a response with two classes, Table 1 cross-tabulates the predicted and actual classes.

Table 1. Cross-tabulation of predicted and actual class

| Predicted class | Actual class: p (+) | Actual class: n (−) |
|---|---|---|
| p (+) | True Positive (TP) | False Positive (FP) |
| n (−) | False Negative (FN) | True Negative (TN) |

Catal [11] defines the sensitivity as $TP/(TP + FN)$ and the specificity as $TN/(FP + TN)$, while the accuracy is measured as $(TP + TN)/(TP + FP + TN + FN)$.

2.2. Logistic Regression Ensemble (LORENS)

Using the Logistic Regression Classification by Ensembles from Random Partitions (LR_CERP) algorithm, LORENS randomly partitions the predictor space into k subspaces of equal size. Because the subspaces are randomly selected from the same distribution, it is assumed that there is no bias in the selection of predictor variables in each subspace. In each subspace, a logistic regression model is fitted without variable selection. Within one ensemble, LORENS combines the predicted values (probabilities) of the logistic regression models from all partitions to increase the accuracy: the probabilities from all models are averaged, and the observation is classified as 0 or 1 using a certain threshold. LORENS generates several ensembles with different random partitions and then selects the highest value among the ensembles. From these values, the optimum accuracy level is determined by choosing the one that improves significantly when the number of ensembles is increased or changed. Lim [3] showed that the accuracy increases significantly when the number of ensembles is more than ten.

The usual threshold for classifying a binary response in logistic regression is 0.5. However, the classification accuracy will not be reliable if the proportions of class 1 and class 0 are unequal. To balance the sensitivity and specificity, LORENS finds the optimal threshold from the following formula, where $\bar{y}$ is the proportion of observations that lie in the positive class.

$$\text{threshold} = \frac{\bar{y} + 0.5}{2}$$
To apply LORENS, either the holdout or the cross-validation method can be used. The holdout procedure takes part of the data for training and uses the rest for testing. In this study, 10% of the data are used as testing data and the remainder as training data. Cross-validation, by contrast, divides the sample into k partitions (folds) of equal size; each fold in turn is used for testing while the remaining folds are used for training. This procedure is repeated until every partition has been used as a testing set (Witten et al. [12]). In summary, the steps of classification with LORENS using the holdout procedure are as follows:
- Randomly partition the predictor variables into k subspaces for one ensemble.
- Fit an LR model for each subspace using the training data.
- Obtain the predicted value from each model for every observation in the testing data.
- Calculate the average of all predicted values for each observation.
- Repeat the steps above to form n ensembles.
- Find the highest predicted value for each observation among all ensembles.
- Calculate the optimal threshold value.
- Classify the observations.
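The steps above can be sketched in plain Python as follows. This is a hedged illustration rather than the authors' implementation: all function names are hypothetical, the logistic models are fitted with simple batch gradient ascent instead of a full maximum likelihood routine, and the toy data reuse the same predictor patterns for training and testing only to keep the example deterministic.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def predict(beta, row):
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], row)))

def fit_logistic(rows, labels, lr=0.1, epochs=200):
    """Assumption: batch gradient ascent stands in for the ML fit."""
    n, p = len(rows), len(rows[0])
    beta = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for row, y in zip(rows, labels):
            err = y - predict(beta, row)
            grad[0] += err
            for j, v in enumerate(row):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def random_partition(n_features, k, rng):
    """Randomly split feature indices into k mutually exclusive subspaces."""
    idx = list(range(n_features))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def lorens_holdout(train_x, train_y, test_x, k=3, n_ensembles=10, seed=0):
    rng = random.Random(seed)
    best = [0.0] * len(test_x)
    for _ in range(n_ensembles):
        models = []
        for cols in random_partition(len(train_x[0]), k, rng):
            sub = [[row[c] for c in cols] for row in train_x]
            models.append((cols, fit_logistic(sub, train_y)))
        for i, row in enumerate(test_x):
            probs = [predict(beta, [row[c] for c in cols]) for cols, beta in models]
            # average within the ensemble, keep the highest across ensembles
            best[i] = max(best[i], sum(probs) / len(probs))
    threshold = (sum(train_y) / len(train_y) + 0.5) / 2
    return [1 if p > threshold else 0 for p in best], threshold

# Hypothetical toy data: all 32 combinations of 5 binary predictors,
# with the response equal to the first predictor.
rows = [[(i >> j) & 1 for j in range(5)] for i in range(32)]
labels = [row[0] for row in rows]
predicted, threshold = lorens_holdout(rows, labels, rows, k=3, n_ensembles=10)
print(threshold, predicted == labels)
```

The sketch follows the listed step of taking the highest averaged value across ensembles; replacing that step with a majority vote over the ensembles would only change the line that updates `best`.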

3. Data and Variables

The data used in this research are secondary data that were pre-processed by Prasasti et al. [7]. The data were taken from the e-commerce website of company 'X' for 2007 to 2013 and consist of records of consumer activities, with the consumer as the unit of observation. The data are distinguished by product price: the Low Price, Medium Price, and High Price products. The sample contains 500000 consumers for the Low Price product, 408810 for the Medium Price product, and 709899 for the High Price product. The variables below, taken from Martono et al. [13], are used in this research.
- Accumulation update ($X_1$)
Accumulation update is the number of updates carried out by the customer from purchase to renewal. Every time the customer makes a purchase or a renewal, the accumulation update increases by 1. The number of updates ranges from 0 to 7.
- Product price ($X_2$)
Product price is the price of the newly purchased product, ranging from 1886 to 39000 Japanese Yen (JPY).
- Contract answer ($X_3$)
Contract answer is the customer's choice, with value 1 for ‘opt-in’ (continue to use a certain product) or 0 for ‘opt-out’ (stop using the product).
- Consumer type ($X_4$)
Consumer type is the type of consumer, with value 0 for individuals and 1 for organizations.
- Delivery status ($X_5$)
Delivery status is the status of the e-mail delivery, with value 1 when it is sent and 0 when it is not sent.
- Customer defection ($Y$)
Customer defection is the consumer's decision to defect from or retain the product, with value 1 if they decide to defect and 0 if they continue to use one or more antivirus products from company ‘X’, even if it is a different product.

4. Results and Discussion

4.1. Classification of customer defection using Binary Logistic Regression

This section starts by applying Binary Logistic Regression to the Low Price customer dataset. With all variables entered in the model, product price and consumer type do not significantly influence customer defection, as shown by P-values greater than the 5 percent significance level. A new model was therefore fitted without these two insignificant variables. The best model obtained for the Low Price customer data is as follows:

$$y(x) = \frac{e^{1.54 - 0.43X_1 - 2.92X_3 - 0.85X_5}}{1 + e^{1.54 - 0.43X_1 - 2.92X_3 - 0.85X_5}}$$
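As an illustration, the fitted Low Price model above can be evaluated for a given customer profile. The sketch below only plugs the rounded coefficients printed in the equation into the logistic function; the function name and the two example profiles are hypothetical.

```python
import math

def low_price_defection_probability(x1, x3, x5):
    """Predicted defection probability from the printed Low Price model.

    x1: accumulation update, x3: contract answer (1 = opt-in),
    x5: delivery status (1 = e-mail sent).
    """
    z = 1.54 - 0.43 * x1 - 2.92 * x3 - 0.85 * x5
    return math.exp(z) / (1.0 + math.exp(z))

# Hypothetical profiles for illustration only.
no_update_opt_out = low_price_defection_probability(x1=0, x3=0, x5=0)
two_updates_opt_in = low_price_defection_probability(x1=2, x3=1, x5=1)
print(round(no_update_opt_out, 2), round(two_updates_opt_in, 2))
```

Under these rounded coefficients, a customer who opted out and has no updates gets a predicted defection probability of about 0.82, while a customer who opted in with two updates and a delivered e-mail gets about 0.04.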

This model is used as the basis for prediction with new observations. For this case, the result of hypothesis testing is consistent with the obtained odds ratios, so the details are omitted for the sake of space. Next, Binary Logistic Regression is applied to the Medium Price consumer data, and the following best regression model is obtained:

$$y(x) = \frac{e^{2.9 - 0.705X_1 - 0.00011X_2 - 3.326X_3 - 0.0528X_4 - 0.201X_5}}{1 + e^{2.9 - 0.705X_1 - 0.00011X_2 - 3.326X_3 - 0.0528X_4 - 0.201X_5}}$$

The hypothesis testing shows that all the variables included in the model above have a significant effect, with the odds ratios listed in Table 2.

Table 2. The coefficients and odds ratios of Medium Price customer data

| Parameter | Coefficient | Odds Ratio |
|---|---|---|
| (Intercept) | 2.9 | 18.24 |
| Update Accumulation | -0.7 | 0.49 |
| Product Price | -0.00011 | 0.99 |
| Contract Answer | -3.33 | 0.04 |
| Consumer Type | -0.053 | 0.95 |
| Delivery Status | -0.2 | 0.82 |
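The odds ratios in Table 2 are the exponentials of the coefficients. A minimal sketch, using the coefficients as printed in the Medium Price equation (the dictionary name is hypothetical), reproduces them up to rounding:

```python
import math

# Coefficients as printed in the Medium Price model equation.
medium_price_coefficients = {
    "(Intercept)": 2.9,
    "Update Accumulation": -0.705,
    "Product Price": -0.00011,
    "Contract Answer": -3.326,
    "Consumer Type": -0.0528,
    "Delivery Status": -0.201,
}

# Odds ratio per one-unit increase of each predictor: exp(coefficient).
odds_ratios = {name: math.exp(b) for name, b in medium_price_coefficients.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: {ratio:.4f}")
```

Small differences from the table, for example for the intercept, are expected because the printed coefficients are themselves rounded.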

Table 2 reveals that Binary Logistic Regression yields misleading results: the product price and consumer type variables have a significant effect in the model according to the test, but not according to the magnitude of their odds ratios. A similar result is obtained for the High Price product, where all variables significantly influence the tendency to defect when tested with the P-value, yet the odds ratios for customer type and product price indicate a nearly zero effect. These misleading results are induced by the very large datasets used to fit the models, which lead to very small P-values. Moreover, the datasets suffer from an imbalanced proportion of the response classes.
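The link between sample size and very small P-values can be illustrated with a short sketch. It assumes, as a simplification, that the standard error of a coefficient shrinks as sigma divided by the square root of n; the effect size, sigma and the sample sizes below are hypothetical and are not taken from the paper.

```python
import math

def wald_p_value(beta_hat, se):
    """Two-sided P-value of the Wald statistic under the normal distribution."""
    w = beta_hat / se
    return math.erfc(abs(w) / math.sqrt(2.0))

def p_value_at_sample_size(beta_hat, sigma, n):
    """Assumption: SE = sigma / sqrt(n), so W grows with sqrt(n)."""
    return wald_p_value(beta_hat, sigma / math.sqrt(n))

# The same tiny hypothetical effect tested at increasing sample sizes.
for n in (1_000, 100_000, 500_000):
    print(n, p_value_at_sample_size(beta_hat=0.01, sigma=1.0, n=n))
```

With the effect held fixed, the P-value falls from well above 0.05 at n = 1,000 to far below it at n = 500,000, even though the magnitude of the effect has not changed.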
Table 3 shows the classification results generated from the models, and Table 4 reports the corresponding accuracy, sensitivity and specificity (in percent).

Table 3. Classification result with Binary Logistic Regression analysis

| Predicted class | Low Price: actual p (+) | Low Price: actual n (−) | Medium Price: actual p (+) | Medium Price: actual n (−) | High Price: actual p (+) | High Price: actual n (−) |
|---|---|---|---|---|---|---|
| p (+) | 22878 | 10299 | 24467 | 6432 | 30337 | 14113 |
| n (−) | 6433 | 10391 | 2842 | 7141 | 8307 | 18234 |

Table 4. Classification accuracy of Binary Logistic Regression analysis

| Product | Accuracy | Sensitivity | Specificity |
|---|---|---|---|
| Low Price | 66.54 | 78.05 | 50.22 |
| Medium Price | 77.32 | 89.60 | 52.61 |
| High Price | 68.42 | 78.50 | 56.37 |

The values in Table 4 follow from the counts in Table 3 and agree with the BLR rows of Table 9; for the Low Price data, for example, the accuracy is (22878 + 10391)/50001 = 66.54%. The Binary Logistic Regression model classifies all three datasets with moderate accuracy, between 66% and 77%, while its specificity is low (50% to 56%). Again, the model suffers from misleading results for the Medium and High Price data.
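The accuracy, sensitivity and specificity can be recomputed directly from the counts in Table 3 with the definitions given in Section 2.1. The function name below is hypothetical; the counts are those reported in Table 3.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, sensitivity, specificity) as fractions."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (fp + tn)
    return accuracy, sensitivity, specificity

# Counts from Table 3 (rows are predicted class, columns are actual class).
table3 = {
    "Low Price": dict(tp=22878, fp=10299, fn=6433, tn=10391),
    "Medium Price": dict(tp=24467, fp=6432, fn=2842, tn=7141),
    "High Price": dict(tp=30337, fp=14113, fn=8307, tn=18234),
}

results = {name: classification_metrics(**counts) for name, counts in table3.items()}
for name, (acc, sens, spec) in results.items():
    print(f"{name}: accuracy={acc:.2%} sensitivity={sens:.2%} specificity={spec:.2%}")
```

The recomputed values match the BLR rows reported in Table 9 (for example 66.54, 78.05 and 50.22 percent for the Low Price data).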

4.2. Classification of customer defection using LORENS with holdout



As an illustration, the LORENS analysis is applied to the Low Price data with 3 partitions and 10 ensembles. This means that the 5 predictor variables are allocated to 3 partitions, and the process is repeated for 10 ensembles, as shown in Table 5.
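Table 5. Allocation of predictor variables in 3 partitions and 10 ensembles (columns are ensembles, entries are partition numbers)

| Predictor variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Update Accumulation | 3 | 2 | 3 | 1 | 1 | 1 | 2 | 1 | 1 | 3 |
| Product Price | 1 | 3 | 1 | 1 | 3 | 2 | 3 | 1 | 2 | 3 |
| Contract Answer | 3 | 3 | 3 | 3 | 2 | 3 | 1 | 3 | 1 | 1 |
| Consumer Type | 1 | 1 | 1 | 3 | 3 | 1 | 1 | 2 | 3 | 1 |
| Delivery Status | 2 | 1 | 2 | 2 | 1 | 3 | 3 | 3 | 3 | 2 |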

Table 5 shows that the first ensemble consists of 3 models (subspaces): product price and consumer type are in model 1, while update accumulation and contract answer are in model 3. Model 2 contains only one predictor variable, i.e. delivery status. Following the specification in Table 5, a logistic regression model is fitted in each partition using the training data. The threshold value is then calculated, and its value is 0.5431. This means that consumers with an average probability greater than 0.5431 are classified into the defection class, and the others into the retained class. The classification results are compared with the actual classes in order to calculate the sensitivity and specificity. Applying the same procedure, the LORENS analysis is performed on all datasets with 1 to 5 partitions, using both the 0.5 threshold and the optimum threshold. The resulting accuracy, sensitivity and specificity are presented in Table 6.
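Table 6. LORENS classification accuracy analysis with holdout (in percent, by number of partitions)

| Threshold | Product | Measure | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|---|
| Optimum | Low Price | Accuracy | 66.54 | 66.54 | 66.25 | 65.04 | 65.00 |
| Optimum | Low Price | Sensitivity | 78.05 | 78.05 | 78.24 | 88.43 | 88.49 |
| Optimum | Low Price | Specificity | 50.22 | 50.22 | 49.27 | 31.90 | 31.73 |
| Optimum | Medium Price | Accuracy | 75.45 | 77.20 | 74.53 | 74.06 | 67.79 |
| Optimum | Medium Price | Sensitivity | 78.80 | 88.09 | 92.09 | 92.32 | 97.96 |
| Optimum | Medium Price | Specificity | 68.72 | 55.29 | 39.20 | 37.34 | 7.09 |
| Optimum | High Price | Accuracy | 69.04 | 67.88 | 67.78 | 67.79 | 67.73 |
| Optimum | High Price | Sensitivity | 76.98 | 78.85 | 78.96 | 78.97 | 78.98 |
| Optimum | High Price | Specificity | 59.56 | 54.77 | 54.43 | 54.43 | 54.30 |
| 0.5 | Low Price | Accuracy | 66.54 | 65.04 | 64.50 | 63.41 | 59.27 |
| 0.5 | Low Price | Sensitivity | 78.05 | 88.09 | 89.55 | 95.09 | 98.66 |
| 0.5 | Low Price | Specificity | 50.22 | 32.39 | 29.01 | 18.54 | 3.47 |
| 0.5 | Medium Price | Accuracy | 77.32 | 69.60 | 66.75 | 66.87 | 66.83 |
| 0.5 | Medium Price | Sensitivity | 89.59 | 96.43 | 99.69 | 99.99 | 99.99 |
| 0.5 | Medium Price | Specificity | 52.61 | 15.62 | 0.49 | 0.24 | 0.10 |
| 0.5 | High Price | Accuracy | 68.42 | 67.78 | 67.74 | 65.58 | 65.95 |
| 0.5 | High Price | Sensitivity | 78.50 | 78.96 | 78.98 | 88.43 | 89.61 |
| 0.5 | High Price | Specificity | 56.37 | 54.43 | 54.32 | 38.29 | 37.68 |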
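The difference between the two threshold choices compared in Table 6 can be made concrete with a short sketch. The function names are hypothetical, and the positive-class share of 0.5862 is not reported in the paper; it is back-solved from the threshold of 0.5431 stated for the Low Price data (2 × 0.5431 − 0.5).

```python
def optimum_threshold(positive_share):
    """LORENS threshold: midpoint between the positive-class share and 0.5."""
    return (positive_share + 0.5) / 2

def classify(probability, threshold):
    """Assign class 1 (defect) when the averaged probability exceeds the threshold."""
    return 1 if probability > threshold else 0

# Assumption: positive-class share back-solved from the reported 0.5431.
low_price_threshold = optimum_threshold(0.5862)

# A hypothetical averaged probability that falls between the two thresholds.
probability = 0.52
print(round(low_price_threshold, 4))
print(classify(probability, 0.5), classify(probability, low_price_threshold))
```

An observation with an averaged probability between 0.5 and the optimum threshold is labelled as a defector under the standard threshold but as retained under the optimum threshold, which is how the optimum threshold trades sensitivity for specificity when defectors are the majority class.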

In the analysis, the optimum partition size is selected where adding one predictor to the model increases the classification accuracy most significantly. For LORENS with holdout, the optimum partition size is 3 for the Low Price data, 4 for the Medium Price data and 1 for the High Price data. Comparing the accuracy obtained with the optimum threshold and with the standard threshold in Table 6, the analysis with the optimum threshold outperforms the analysis with a threshold of 0.5.

4.3. Classification of customer defection using LORENS with cross validation

The LORENS analysis can also use the cross validation method in the classification steps. This method treats all observations equally, in the sense that each observation serves in both the training set and the testing set. In this case, LORENS with cross validation is not directly comparable with Binary Logistic Regression or LORENS with holdout, because the training and testing datasets involved in the analysis are different.

Suppose that the High Price dataset is analyzed using 2 partitions and 10 ensembles with 10 folds. The predictor variables are allocated to the partitions, and the training data are used to construct the models. In the first fold, the predictor variables are substituted into the model for each partition. The probability values from the two partitions of the same ensemble are averaged and then compared with the optimum threshold value to predict the class. With LORENS, a different threshold is obtained in each fold because the training datasets also differ. Table 7 presents the optimum threshold values for each fold when analysing the High Price data with 2 partitions.
Table 7. Optimum threshold for each fold

| Fold | Optimum Threshold | Fold | Optimum Threshold |
|---|---|---|---|
| 1 | 0.522120 | 6 | 0.522233 |
| 2 | 0.522158 | 7 | 0.522117 |
| 3 | 0.522007 | 8 | 0.522003 |
| 4 | 0.522338 | 9 | 0.522365 |
| 5 | 0.522257 | 10 | 0.522257 |
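The fold-by-fold thresholds arise because each fold leaves out a different tenth of the data, so the positive-class share of the remaining training data changes slightly. The sketch below illustrates this with hypothetical labels (544 defectors in 1000 shuffled observations, a share chosen only to resemble the magnitude seen in Table 7); the function names are also hypothetical.

```python
import random

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for f in range(k):
        size = base + (1 if f < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def fold_thresholds(labels, k=10):
    """Optimum threshold computed from the training part of each fold."""
    thresholds = []
    for test_idx in k_fold_indices(len(labels), k):
        held_out = set(test_idx)
        train = [y for i, y in enumerate(labels) if i not in held_out]
        y_bar = sum(train) / len(train)
        thresholds.append((y_bar + 0.5) / 2)
    return thresholds

# Hypothetical labels, shuffled once so that contiguous folds are random.
rng = random.Random(0)
labels = [1] * 544 + [0] * 456
rng.shuffle(labels)
thresholds = fold_thresholds(labels, k=10)
for fold, t in enumerate(thresholds, start=1):
    print(fold, round(t, 6))
```

As in Table 7, the thresholds differ only slightly between folds, because nine tenths of the data are shared by any two training sets.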

After applying the complete procedure of LORENS with 10 ensembles including the majority voting
steps, we obtained the accuracy, sensitivity and specificity as listed in Table 8.
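Table 8. LORENS classification accuracy analysis with cross validation (in percent, by number of partitions)

| Threshold | Product | Measure | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|---|
| Optimum | Low Price | Accuracy | 66.34 | 66.34 | 66.04 | 65.09 | 65.04 |
| Optimum | Low Price | Sensitivity | 77.69 | 77.70 | 77.99 | 88.22 | 88.35 |
| Optimum | Low Price | Specificity | 50.26 | 50.24 | 49.12 | 32.31 | 32.01 |
| Optimum | Medium Price | Accuracy | 75.16 | 76.81 | 74.50 | 73.97 | 67.76 |
| Optimum | Medium Price | Sensitivity | 78.45 | 88.04 | 91.97 | 92.20 | 97.80 |
| Optimum | Medium Price | Specificity | 68.54 | 54.23 | 39.35 | 37.29 | 7.32 |
| Optimum | High Price | Accuracy | 69.21 | 68.08 | 68.03 | 68.03 | 72.78 |
| Optimum | High Price | Sensitivity | 76.87 | 78.69 | 78.85 | 78.86 | 78.73 |
| Optimum | High Price | Specificity | 60.06 | 55.40 | 55.10 | 55.10 | 67.66 |
| 0.5 | Low Price | Accuracy | 66.34 | 65.08 | 64.70 | 63.56 | 59.33 |
| 0.5 | Low Price | Sensitivity | 77.69 | 88.29 | 89.16 | 95.01 | 98.63 |
| 0.5 | Low Price | Specificity | 50.26 | 32.20 | 30.04 | 19.01 | 3.65 |
| 0.5 | Medium Price | Accuracy | 77.19 | 73.61 | 66.76 | 66.88 | 66.84 |
| 0.5 | Medium Price | Sensitivity | 89.31 | 92.56 | 99.66 | 99.98 | 99.99 |
| 0.5 | Medium Price | Specificity | 52.82 | 35.49 | 0.57 | 0.27 | 0.15 |
| 0.5 | High Price | Accuracy | 68.70 | 68.03 | 68.00 | 65.26 | 65.98 |
| 0.5 | High Price | Sensitivity | 78.41 | 78.85 | 78.88 | 86.01 | 89.50 |
| 0.5 | High Price | Specificity | 57.09 | 55.10 | 55.00 | 40.48 | 37.88 |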

Similar to the holdout method, the optimum partition size is selected where adding variables to the model improves the accuracy significantly. For LORENS with cross validation, the optimum partition size is 3 for the Low Price data, 4 for the Medium Price data, and 1 for the High Price data. Classification with the optimum threshold again yields better classification results.

4.4. Best model selection

Table 9 below summarizes the classification accuracy of Binary Logistic Regression and of LORENS with the optimum partition and optimum threshold.

Table 9. Comparison of the classification accuracy of BLR and LORENS (in percent)

| Product | Method | Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|
| Low Price | BLR | 66.54 | 78.05 | 50.22 |
| Low Price | LORENS-Holdout | 66.25 | 78.24 | 49.27 |
| Medium Price | BLR | 77.32 | 89.59 | 52.61 |
| Medium Price | LORENS-Holdout | 74.06 | 92.32 | 37.34 |
| High Price | BLR | 68.42 | 78.50 | 56.37 |
| High Price | LORENS-Holdout | 69.04 | 76.98 | 59.56 |

The classification accuracy of Binary Logistic Regression is slightly higher than that of LORENS for the Medium Price data (77.32% versus 74.06%). LORENS outperforms Binary Logistic Regression for the High Price data (69.04% versus 68.42%), and the two methods give similar results for the Low Price data (66.25% versus 66.54%). Because Binary Logistic Regression produces misleading results, the accuracy generated by this method is also questionable, whereas the accuracy generated by LORENS is valid because this method is free of assumptions.

5. Conclusion

This paper has successfully applied LORENS to classification problems where the sample size is very large. Although LORENS was originally developed for high dimensional data, in the sense that the number of variables exceeds the sample size, this paper shows that LORENS can still be applied when the number of variables is limited but the sample size is large. The analysis showed that the standard logistic regression model fails to generate consistent results between the P-value test and the odds ratio. LORENS also outperforms logistic regression in some cases. To deal with the threshold choice, LORENS offers a fair way to set the threshold, and using the optimum threshold in LORENS yields better classification results than using a threshold of 0.5.

References

[1] Columbus, L. Predicting enterprise cloud computing growth. Forbes (April 9, 2013), available at http://www.forbes.com/sites/louiscolumbus/2013/09/04/predicting-enterprise-cloud-computing-growth/ (accessed on July 10, 2014).
[2] Lin, M., Lucas, H.C. Jr. and Shmueli, G. Too big to fail: large samples and the p-value problem. Information Systems Research, Article in Advance, 1-12; 2013.
[3] Lim, N. Classification by ensembles from random partitions using logistic models. PhD thesis, Stony Brook University; 2007.
[4] Lim, N., Ahn, H., Moon, H., Chen, J.J. Classification of High Dimensional Data with Ensemble of Logistic Regression Models. Journal of Biopharmaceutical Statistics 20:160-17; 2010.
[5] Lee, K., Ahn, H., Moon, H., Kodell, R.L. and Chen, J.J. Multinomial Logistic Regression Ensembles. Biopharm Stat, 23(3), 681-94; 2013.
[6] King, G. and Zeng, L. Logistic regression in rare events data. Society for Political Methodology WV006-01; 2001.
[7] Prasasti, N., Okada, M., Kanamori, K. and Ohwada, H. Customer lifetime value and defection possibility prediction model using machine learning: an application to a cloud-based software company. Lecture Notes in Computer Science, 8399; 2013.
[8] Prasasti, N. and Ohwada, H. Applicability of machine-learning techniques in predicting customer defection. In: International Symposium on Technology Management and Emerging Technologies (ISTMET); 2014.
[9] Hosmer, D.W. and Lemeshow, S. Applied logistic regression, Second Edition. New York: John Wiley & Sons, Inc.; 2000.
[10] Agresti, A. Categorical data analysis. New York: John Wiley & Sons, Inc.; 2002.
[11] Catal, C. Performance evaluation metrics for software fault prediction studies. Acta Polytechnica Hungarica; 2012, 9(4).
[12] Witten, I.H., Frank, E., Hall, M.A. Data mining: practical machine learning tools and techniques, 3rd Edition. Burlington: Morgan Kaufmann; 2001.
[13] Martono, N.P., Kanamori, K. and Ohwada, H. Utilizing customer’s purchase and contract renewal details to predict defection in the cloud software industry. Springer International Publishing Switzerland: PKAW 2014, LNCS 8863; 2014, 138–149.