Machine Learning and Data Mining: Prof. Alexander Ihler
[Figure: given training data of (Feature x, y) pairs, what is y(new) at a new input x(new)? Three panels pose the same question; a final panel shows a 2-D classification version with features X1, X2, class labels 0/1, and a query point "?"]
Nearest neighbor classifier
Predictor:
Given new features: find the nearest training example
Return its value
Typically Euclidean distance: D(x, x') = sqrt( Σ_j (x_j − x'_j)² )
[Figure: 2-D scatter (features X1, X2, labels 0/1) with a query point "?"; which training x is closest?]
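A minimal sketch of this predictor (Python with NumPy assumed; the function name and toy data are illustrative, not from the slides):

    import numpy as np

    def nearest_neighbor_predict(X_train, y_train, x_new):
        # Squared Euclidean distance from x_new to every training example
        dists = np.sum((X_train - x_new) ** 2, axis=1)
        # Index of the closest training example; return its stored value
        return y_train[np.argmin(dists)]

    # Tiny illustrative example
    X_train = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 1.0]])
    y_train = np.array([0, 1, 0])
    print(nearest_neighbor_predict(X_train, y_train, np.array([2.9, 3.5])))  # -> 1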
Nearest neighbor classifier
[Figure: the same 2-D scatter (X1, X2), shading all points where we decide class 0]
Nearest neighbor classifier
Voronoi tessellation: each datum is assigned to a region in which all points are closer to it than to any other datum
[Figure: Voronoi regions over the training points (X1, X2); the decision boundary follows the region edges between the two classes]
Nearest neighbor classifier
Nearest neighbor: piecewise linear boundary
[Figure: 2-D scatter (X1, X2) with regions labeled Class 0 and Class 1, separated by a piecewise linear decision boundary]
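One way to see the piecewise linear structure is to evaluate the 1-NN rule over a dense grid and shade the predicted class; a rough sketch assuming NumPy and matplotlib (the random toy data are purely illustrative):

    import numpy as np
    import matplotlib.pyplot as plt

    def nn_predict(X_train, y_train, points):
        # Nearest-neighbor prediction for each row of `points`
        preds = np.empty(len(points), dtype=y_train.dtype)
        for i, p in enumerate(points):
            preds[i] = y_train[np.argmin(np.sum((X_train - p) ** 2, axis=1))]
        return preds

    # Illustrative 2-D training data: two noisy clusters, labels 0 and 1
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(20, 2)) + np.array([[-2, -2]] * 10 + [[2, 2]] * 10)
    y_train = np.array([0] * 10 + [1] * 10)

    # Evaluate the classifier on a grid covering the feature space
    xx, yy = np.meshgrid(np.linspace(-5, 5, 200), np.linspace(-5, 5, 200))
    grid = np.column_stack([xx.ravel(), yy.ravel()])
    zz = nn_predict(X_train, y_train, grid).reshape(xx.shape)

    plt.contourf(xx, yy, zz, alpha=0.3)                   # shaded decision regions
    plt.scatter(X_train[:, 0], X_train[:, 1], c=y_train)  # training points
    plt.xlabel("X1"); plt.ylabel("X2")
    plt.show()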
More Data Points
[Figure: a larger 2-D training set (X1, X2) with points from class 1 and class 2]
More Complex Decision Boundary
In general: the nearest-neighbor classifier produces piecewise linear decision boundaries
[Figure: the class 1 / class 2 data (X1, X2) with the resulting, more complex piecewise linear boundary]
Regression
Usually just average the y-values of the k closest training examples
Classification
Find the k closest training examples; this yields k feature vectors and a set of k class labels
Pick the class label which is most common in this set (vote)
Classify x as belonging to this class
Note: for two-class problems, if k is odd (k = 1, 3, 5, …) there will never be any ties
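A compact sketch of both variants above, under the same assumptions as before (NumPy; function names are illustrative):

    import numpy as np
    from collections import Counter

    def k_nearest(X_train, x_new, k):
        # Indices of the k training examples closest to x_new (Euclidean)
        dists = np.sum((X_train - x_new) ** 2, axis=1)
        return np.argsort(dists)[:k]

    def knn_regress(X_train, y_train, x_new, k):
        # Regression: average the y-values of the k closest training examples
        return y_train[k_nearest(X_train, x_new, k)].mean()

    def knn_classify(X_train, y_train, x_new, k):
        # Classification: majority vote over the k closest examples' labels
        labels = y_train[k_nearest(X_train, x_new, k)]
        return Counter(labels.tolist()).most_common(1)[0][0]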
kNN Decision Boundary
Recall: piecewise linear decision boundary
Increasing k simplifies decision boundary
Majority voting means less emphasis on individual points
[Figure: decision boundaries for K = 1 and K = 3]
kNN Decision Boundary
[Figure: decision boundaries for K = 5 and K = 7]
Error rates and K
[Figure: decision boundary for K = 25; plot of predictive error vs. K (# neighbors), with the best value of K marked]
Complexity & Overfitting
Complex model predicts all training points well
But doesn't generalize to new data points
K=1 : perfect memorization of examples (complex)
K=M : always predict the majority class in the dataset (simple)
Can select K using validation data, etc. (see the sketch below)
[Figure: model complexity vs. K (# neighbors): small K is too complex, large K is simpler]
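A minimal sketch of choosing K on held-out validation data (NumPy assumed; the candidate K values and error function are illustrative choices, not prescribed by the slides):

    import numpy as np
    from collections import Counter

    def knn_classify(X_train, y_train, x_new, k):
        # Majority vote among the k nearest training examples
        dists = np.sum((X_train - x_new) ** 2, axis=1)
        labels = y_train[np.argsort(dists)[:k]]
        return Counter(labels.tolist()).most_common(1)[0][0]

    def validation_error(X_train, y_train, X_val, y_val, k):
        # Fraction of validation points the k-NN rule gets wrong
        preds = np.array([knn_classify(X_train, y_train, x, k) for x in X_val])
        return np.mean(preds != y_val)

    def select_k(X_train, y_train, X_val, y_val, ks=(1, 3, 5, 7, 9, 15, 25)):
        # Keep the K with the lowest validation error
        errors = {k: validation_error(X_train, y_train, X_val, y_val, k) for k in ks}
        return min(errors, key=errors.get)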
K-Nearest Neighbor (kNN) Classifier
Theoretical Considerations
As k increases:
we are averaging over more neighbors
the effective decision boundary becomes smoother
As the number of training examples increases, the optimal k value tends to increase
For k = 1, as the number of examples m grows to infinity: error rate < 2× the optimal (Bayes) error rate
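That last statement is the classic Cover & Hart (1967) asymptotic result; a common form for two classes, writing R* for the Bayes-optimal error rate and R_1NN for the limiting 1-NN error rate, is:

    R^* \;\le\; R_{1\text{-NN}} \;\le\; 2R^*(1 - R^*) \;\le\; 2R^*

So with unlimited data, the 1-NN rule is never worse than twice the best achievable error rate.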