
C4B

Machine Learning I
A. Zisserman, Hilary Term 2009

1. Given the following training data for a {0, 1} binary classifier:

   (x1 = 0.8, y1 = 1); (x2 = 0.4, y2 = 0); (x3 = 0.6, y3 = 1)

   Determine the output of a K Nearest Neighbour (K-NN) classifier for all points on the interval 0 ≤ x ≤ 1 using (a) 1-NN, (b) 3-NN.

2. A regressor algorithm is defined using the mean of the K nearest neighbours of a test point. Determine its output on the interval 0 ≤ x ≤ 1 for K = 2, using the training data of question 1.

3. Two students are working on a machine-learning approach to spam detection. Each student has their own set of 100 labelled emails, 90% of which are used for training and 10% for validating the model. Student A runs a K-NN classification algorithm and reports 80% accuracy on her validation set. Student B experiments with over 100 different learning algorithms, training each one on his training set and recording its accuracy on the validation set. His best formulation achieves 90% accuracy. Whose algorithm would you pick for protecting a corporate network from spam? Why?

4. Are the following sets of points linearly separable?

   (a) S1: (1, 2, 0), (2, 4, 0), (3, 1, 0)
       S2: (2, 4, 1), (1, 5, 1), (5, 0, 1)

   (b) S1: (1, 2), (2, 4), (3, 1)
       S2: (2, 4), (1, 5), (5, 0)

   (c) Describe how the convex hulls of the sets can be used to determine whether they are linearly separable.

5. For a linear SVM, f(x) = w·x + b, show:

   (a) that the value of b can be computed from w and one support vector;

   (b) that the vector w in the primal cost function can be expressed as w = Σ_i α_i x_i, where the x_i are the training data. (Hint: start by expressing w = Σ_i α_i x_i + w⊥, where w⊥ is orthogonal to every x_i.)

6. Show that if k1(x, x') and k2(x, x') are both valid kernels, then so is k1(x, x') + k2(x, x'). (Hint: start from the properties of the Gram matrix K_i associated with k_i(x, x').)

7. K-NN can be used in a transformed feature space x → φ(x) using the kernel trick. By expanding ||φ(x) − φ(x')||², show that the distance between two points can be written in terms of kernels.

8. (a) Show that for the logistic sigmoid function σ(z),

       dσ/dz = σ(1 − σ)

   (b) The negative log-likelihood for logistic regression training is
       L(w) = − Σ_{i=1}^{n} [ y_i log σ(w·x_i) + (1 − y_i) log(1 − σ(w·x_i)) ]

   Show that its gradient has the simple form:

       dL/dw = − Σ_{i=1}^{n} (y_i − σ(w·x_i)) x_i
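The gradient form in question 8(b) can be checked numerically before deriving it. A minimal NumPy sketch (the data points and weights below are made-up values for illustration, not part of the question):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nll(w, X, y):
    # Negative log-likelihood L(w) from question 8(b)
    p = sigmoid(X @ w)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def grad(w, X, y):
    # Claimed closed form: dL/dw = -sum_i (y_i - sigma(w.x_i)) x_i
    return -X.T @ (y - sigmoid(X @ w))

# Toy 2-D data (hypothetical)
X = np.array([[0.8, 1.0], [0.4, 1.0], [0.6, 1.0], [0.1, 1.0]])
y = np.array([1.0, 0.0, 1.0, 0.0])
w = np.array([0.5, -0.2])

# Central finite differences agree with the closed form
eps = 1e-6
num = np.array([(nll(w + eps * e, X, y) - nll(w - eps * e, X, y)) / (2 * eps)
                for e in np.eye(2)])
print(np.allclose(num, grad(w, X, y), atol=1e-5))  # True
```

Agreement of the two gradients is strong evidence that the closed form is correct; the analytic derivation asked for in the question then confirms it exactly.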
