Assignment 4

Introduction to Machine Learning


Prof. B. Ravindran
1. Consider the 1 dimensional dataset:

x y
-1 1
0 -1
2 1

(Note: x is the feature, and y is the output)

State true or false: The dataset becomes linearly separable after using basis expansion
with the following basis function ϕ(x) = (1, x³)ᵀ.

(a) True
(b) False

Sol. (b)
After applying the basis expansion, x1′ = (1, −1)ᵀ, x2′ = (1, 0)ᵀ, and x3′ = (1, 8)ᵀ. Since the
first coordinate is constant, separability depends only on x³, and the negative point (0) lies
between the two positive points (−1 and 8); hence the data points are not linearly separable.
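As a quick numerical sanity check (a sketch, not part of the original solution), one can fit a
linear SVM with a very large C on the expanded features and observe that it cannot reach perfect
training accuracy; the data and labels below come from the question, while the choice of C and
the use of sklearn are illustrative assumptions.

import numpy as np
from sklearn.svm import SVC

x = np.array([-1., 0., 2.])
y = np.array([1, -1, 1])

# basis expansion phi(x) = (1, x^3)
phi = np.column_stack([np.ones_like(x), x ** 3])

# a very large C approximates a hard-margin (no-slack) SVM
clf = SVC(kernel='linear', C=1e6).fit(phi, y)
print(clf.score(phi, y))  # < 1.0, so the expanded data is not linearly separable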
2. Consider a linear SVM trained with n labeled points in R² without slack penalties and resulting
in k = 2 support vectors, where n > 100. By removing one labeled training point and retraining
the SVM classifier, what is the maximum possible number of support vectors in the resulting
solution?
(a) 1
(b) 2
(c) 3
(d) n − 1
(e) n
Sol. (d)
As discussed in the lecture, there is no maximum bound on the number of support vectors,
and the whole data set can be selected as support vectors. Thus, after removing one labeled
training point, the maximum number of support vectors will be n − 1.
3. Which of the following are valid kernel functions?

(a) (1 + <x, x′>)^d

(b) tanh(K1 <x, x′> + K2)

(c) exp(−γ ||x − x′||²)

Sol. (a), (b), (c)
Refer to the lecture. These are the polynomial, sigmoid (hyperbolic tangent), and RBF (Gaussian)
kernels, respectively.
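One informal way to check kernel validity (a sketch, not part of the assignment) is to verify
that the Gram matrix is positive semi-definite on an arbitrary sample of points. The snippet
below does this for the RBF kernel from option (c); the sample points and gamma value are
assumptions chosen only for illustration.

import numpy as np

def rbf(a, b, gamma=0.5):
    # kernel (c): exp(-gamma * ||a - b||^2), here for scalar inputs
    return np.exp(-gamma * (a - b) ** 2)

rng = np.random.default_rng(0)
pts = rng.normal(size=10)
gram = np.array([[rbf(a, b) for b in pts] for a in pts])
print(np.linalg.eigvalsh(gram).min())  # >= 0 (up to numerical error) for a valid kernel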
4. (2 marks) Consider the following dataset:

x y
1 1
2 0
3 0
4 0
7 0
8 1
9 1
10 1

(Note: x is the feature and y is the output)

Which of these is not a support vector when using a Support Vector Classifier with a polynomial
kernel with degree = 3, C = 1, and gamma = 0.1?
(We recommend using sklearn to solve this question.)

(a) 3
(b) 1
(c) 9
(d) 10

Sol. (b)
The following code will give the support vectors:
>>import numpy as np
>>from sklearn.svm import SVC
>>classif_algo = SVC(C=1, kernel='poly', degree=3, gamma=0.1)
>>X = np.array([1., 2., 3., 4., 7., 8., 9., 10.])
>>X = X[:, None]  # reshape so each row is one sample
>>Y = np.array([1, 0, 0, 0, 0, 1, 1, 1])
>>classifier = classif_algo.fit(X, Y)
>>print(classifier.support_vectors_)
5. Consider an SVM with a second order polynomial kernel. Kernel 1 maps each input data
point x to K1(x) = (x, x²)ᵀ. Kernel 2 maps each input data point x to K2(x) = (3x, 3x²)ᵀ.
Assume the hyper-parameters are fixed. Which of the following options is true?
(a) The margin obtained using K2 (x) will be larger than the margin obtained using K1 (x).
(b) The margin obtained using K2 (x) will be smaller than the margin obtained using K1 (x).
(c) The margin obtained using K2 (x) will be the same as the margin obtained using K1 (x).
Sol. (a)
Since the value of K2(x) is thrice that of K1(x), all distances in K2(x)-space are three times
as large as the corresponding distances in K1(x)-space. This means that the margin (roughly,
the "thickness" of the separating hyperplane that the SVM learns) is also three times as large.
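The scaling argument can be checked numerically with a small sketch: train a linear SVM on the
two explicit feature maps and compare the geometric margins 1/||w||. The toy dataset and the
near-hard-margin C below are illustrative assumptions, not part of the question.

import numpy as np
from sklearn.svm import SVC

x = np.array([-2., -1., 1., 2.])
y = np.array([0, 0, 1, 1])

F1 = np.column_stack([x, x ** 2])   # feature map K1(x) = (x, x^2)
F2 = 3 * F1                         # feature map K2(x) = (3x, 3x^2)

margins = []
for F in (F1, F2):
    clf = SVC(kernel='linear', C=1e6).fit(F, y)   # large C ~ no slack
    margins.append(1.0 / np.linalg.norm(clf.coef_))
print(margins[1] / margins[0])  # close to 3, as argued above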

For Q6,7: Kindly download the modified version of the Iris dataset from this link:
https://goo.gl/vchhsd
The dataset contains 150 points; each input point has 4 features and belongs to one of three
classes. Use the first 100 points as the training data and the remaining 50 as test data. In
the following questions, report accuracy on the test dataset, rounded to two decimal places.
(Note: Do not change the order of the data points.)

6. (2 marks) Train a linear Perceptron classifier on the modified iris dataset. Report the best
classification accuracy for the l1 and elasticnet penalty terms, respectively.
(We recommend using sklearn.)
(a) 0.82, 0.64
(b) 0.90, 0.71
(c) 0.84, 0.82
(d) 0.78, 0.64
Sol. (c)
The following code will give the desired result:
>>from sklearn.linear_model import Perceptron
>># X, Y hold the 150 feature vectors and labels from the modified iris dataset
>>clf = Perceptron(penalty="l1").fit(X[0:100], Y[0:100])
>>clf.score(X[100:], Y[100:])
>>clf = Perceptron(penalty="elasticnet").fit(X[0:100], Y[0:100])
>>clf.score(X[100:], Y[100:])

7. (2 marks) Train an SVM classifier on the modified iris dataset. We encourage you to explore
the impact of varying different hyperparameters of the model. Specifically, try different kernels
and the associated hyperparameters. As part of the assignment, train models with the following
hyperparameters: kernel = poly, gamma = 0.4, one-vs-rest decision function, no feature
normalization. Try C = 0.1, 0.9, 10, 0.000001. For the above hyperparameters, report the best
classification accuracy.
(We recommend using sklearn.)
(a) 0.98
(b) 0.96
(c) 0.92
(d) 0.94
Sol. (b)
The following code will give the desired result:
>>from sklearn import svm
>>clf = svm.SVC(C=10, kernel='poly', decision_function_shape='ovr', gamma=0.4).fit(X[0:100], Y[0:100])
>>clf.score(X[100:], Y[100:])
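To cover all four C values listed in the question, a short loop such as the following can be
used. It assumes X and Y already hold the modified iris features and labels in their original
order, as in the snippet above; otherwise it is only an illustrative sketch.

from sklearn import svm

best = 0.0
for C in (0.1, 0.9, 10, 0.000001):
    clf = svm.SVC(C=C, kernel='poly', decision_function_shape='ovr', gamma=0.4)
    clf.fit(X[0:100], Y[0:100])
    best = max(best, clf.score(X[100:], Y[100:]))
print(round(best, 2))  # the best of these, 0.96, corresponds to option (b)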
