Solution 1
Question 2.1
Describe a situation or problem from your job, everyday life, current events, etc., for which a
classification model would be appropriate. List some (up to 5) predictors that you might use.
Solution 2.1:
Problem Statement: A B2B company selling supply chain software to other
businesses is facing yearly churn significantly above 10%. Although the focus is on
acquiring new clients to increase revenue, one major target is to retain existing
customers and keep churn below 10%.
Solution:
1. Phase 1: Classify existing customers by whether or not they will churn in the near future.
2. Phase 2: For each customer predicted to churn, identify which predictor variable contributes
most to that prediction, so that relationship managers can address that specific problem and
stop the customer from churning.
Predictor variables (illustrative examples for this scenario):
1. Product usage frequency (e.g., monthly logins or active users)
2. Number of open support tickets and their severity
3. Customer tenure (time since first contract)
4. Time remaining until contract renewal
5. Year-over-year change in spend or license count
Question 2.2
1. Using the support vector machine function ksvm contained in the R package kernlab, find a
good classifier for this data. Show the equation of your classifier, and how well it classifies the
data points in the full data set. (Don’t worry about test/validation data yet; we’ll cover that
topic soon.)
Notes on ksvm
You can use scaled=TRUE to get ksvm to scale the data as part of calculating a
classifier.
The term λ we used in the SVM lesson to trade off the two components of correctness
and margin is called C in ksvm. One of the challenges of this homework is to find a
value of C that works well; for many values of C, almost all predictions will be “yes” or
almost all predictions will be “no”.
ksvm does not directly return the coefficients a0 and a1…am. Instead, you need to do
the last step of the calculation yourself. Here’s an example of the steps to take
(assuming your data is stored in a matrix called data): [1]
Hint: You might want to view the predictions your model makes; if C is too large or too small,
they’ll almost all be the same (all zero or all one) and the predictive value of the model will be
poor. Even finding the right order of magnitude for C might take a little trial-and-error.
model <- ksvm(as.matrix(data[,1:10]), as.factor(data[,11]),
              type="C-svc", kernel="vanilladot", C=100, scaled=TRUE)
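As the notes above say, the last step of the calculation has to be done yourself. A minimal sketch of those remaining steps, assuming the same data matrix as in the example (xmatrix, coef, and b are slots of kernlab's fitted ksvm object):
# recover a1...am: sum the support vectors weighted by their coefficients
a <- colSums(model@xmatrix[[1]] * model@coef[[1]])
# recover the intercept a0
a0 <- -model@b
# predict on the full data set and compute the fraction classified correctly
pred <- predict(model, data[,1:10])
sum(pred == data[,11]) / nrow(data)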
[1] I know I said I wouldn’t give you exact R code to copy, because I want you to learn for yourself. In general, that’s definitely true – but in this case, because it’s your first R assignment and because the ksvm function leaves you in the middle of a mathematical calculation that we haven’t gotten into in this course, I’m giving you the code.
Solution 2.2.1:
Equation of my classifier (the decision boundary; a point is classified as 1 or 0 according to the
sign of the left-hand side):
−0.0010065348×A1 − 0.0011729048×A2 − 0.0016261967×A3 + 0.0030064203×A8
+ 1.0049405641×A9 − 0.0028259432×A10 + 0.0002600295×A11 − 0.0005349551×A12
− 0.0012283758×A14 + 0.1063633995×A15 + 0.08158492 = 0
The coefficients and intercept were computed from the fitted SVM classifier (see the attached R
file for code details).
To find the most accurate model, I ran the classifier with different values of C; the effect of the
regularization parameter C on accuracy is summarized below:
The best accuracy of the model is 0.8639144, observed for C from 0.01 to 100.
1. As "C" increases from 1e-04 to 0.001, there is a significant jump in accuracy from 0.547 to
0.838. This suggests that increasing "C" initially helps improve the model's performance.
2. From "C" values of 0.001 to 0.01, there is a smaller increase in accuracy from 0.838 to
0.864.
3. After "C" reaches 0.01, increasing it further doesn't seem to lead to any significant
improvement in accuracy. The highest accuracy of 0.864 is achieved with "C" values from
0.01 to 100. This suggests that the model reaches its optimal performance with a
moderate level of regularization.
4. The accuracy starts to slightly drop when "C" is set to 1000 (accuracy of 0.862), which
might indicate that the model is starting to overfit.
Refer to the screenshot below, which lists the accuracies of the SVM classifier (see the attached
R file for code details) for different values of C. A sketch of the sweep itself follows.
2. You are welcome, but not required, to try other (nonlinear) kernels as well; we’re not covering
them in this course, but they can sometimes be useful and might provide better predictions than
vanilladot.
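Though not attempted here, a minimal sketch of trying a nonlinear kernel with kernlab's Gaussian (RBF) kernel, rbfdot, using its default automatic kernel-parameter estimation; no claim is made that it outperforms vanilladot on this data:
model_rbf <- ksvm(as.matrix(data[,1:10]), as.factor(data[,11]),
                  type="C-svc", kernel="rbfdot", C=100, scaled=TRUE)
pred_rbf <- predict(model_rbf, data[,1:10])
sum(pred_rbf == data[,11]) / nrow(data)   # full-data accuracy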
3. Using the k-nearest-neighbors classification function kknn contained in the R kknn package,
suggest a good value of k, and show how well it classifies the data points in the full data set.
Don’t forget to scale the data (scale=TRUE in kknn).
Notes on kknn
You need to be a little careful. If you give it the whole data set to find the closest points
to i, it’ll use i itself (which is in the data set) as one of the nearest neighbors. A helpful
feature of R is the index -i, which means “all indices except i”. For example, data[-i,] is
all the data except for the ith data point. For our data file where the first 10
columns are predictors and the 11th column is the response, data[-i,11] is the
response for all but the ith data point, and data[-i,1:10] are the predictors for all
but the ith data point.
(There are other, easier ways to get around this problem, but I want you to get
practice doing some basic data manipulation and extraction, and maybe some looping
too.)
Note that kknn will read the responses as continuous, and return the fraction of the k
closest responses that are 1 (rather than the most common response, 1 or 0).
Solution 2.2.3:
For the k-nearest-neighbors classification here's a summary of the accuracy values:
K = 1: Accuracy = 0.8149847
K = 2: Accuracy = 0.8149847
K = 3: Accuracy = 0.8149847
K = 4: Accuracy = 0.8149847
K = 5: Accuracy = 0.851682
K = 6: Accuracy = 0.8455657
K = 7: Accuracy = 0.8470948
K = 8: Accuracy = 0.8486239
K = 9: Accuracy = 0.8470948
K = 10: Accuracy = 0.8501529
K = 11: Accuracy = 0.851682
K = 12: Accuracy = 0.853211
K = 13: Accuracy = 0.851682
K = 14: Accuracy = 0.851682
K = 15: Accuracy = 0.853211
K = 16: Accuracy = 0.851682
K = 17: Accuracy = 0.851682
K = 18: Accuracy = 0.851682
K = 19: Accuracy = 0.8501529
K = 20: Accuracy = 0.8501529
Based on this output, the accuracy values show how the performance of the k-NN model
changes as k varies.
1. The accuracy initially holds steady at 0.8149847 for small values of k (1 to 4). When k is
this small, the model may be overly sensitive to noise in the data, leading to poorer
generalization.
2. The accuracy then improves, reaching its peak of approximately 0.853211 at k = 12 and
k = 15.
3. Beyond that, the accuracy stabilizes and does not improve for higher values of k.
Refer to the screenshot below, which lists the accuracies of the k-NN classifier (see the attached
R file for details) for different values of k. A sketch of the leave-one-out loop follows.
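A minimal sketch of that leave-one-out loop, assuming data is a data frame whose response column (column 11) is named R1, as in the classifier equation above; accuracy_for_k is a hypothetical helper name. Because kknn reads the response as continuous and returns the fraction of the k nearest responses that are 1, the fitted value is rounded to the nearest class:
library(kknn)
accuracy_for_k <- function(k) {
  preds <- rep(0, nrow(data))
  for (i in 1:nrow(data)) {
    # train on every point except i, then predict point i itself
    m <- kknn(R1 ~ ., train = data[-i,], test = data[i,], k = k, scale = TRUE)
    preds[i] <- round(fitted(m))   # fraction of 1s among neighbors -> nearest class
  }
  sum(preds == data[,11]) / nrow(data)
}
sapply(1:20, accuracy_for_k)   # k = 1 to 20, as reported above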