20180723161729D4730 - Pert18 - K-Nearest Neighbor
K-Nearest Neighbor
Session 18
Learning Outcomes
At the end of this session, students will be able to:
Outline
1. Non-Parametric Model
2. K-Nearest Neighbor
3. Distance Metric
Non-Parametric Model
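• Parametric model: summarizes the data with a fixed-size set of
parameters, independent of the number of training examples
• Non-parametric model: cannot be characterized by a bounded set
of parameters
– K-nearest neighbors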
Non-Parametric Model
• For example, suppose that each hypothesis we generate
simply retains within itself all of the training examples and uses
all of them to predict the next example
– Table lookup
Table Lookup
• Take all the training examples, put them in a lookup table,
and then when asked for h(x), see if x is in the table; if it is,
return the corresponding y
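Table lookup can be sketched in a few lines of Python. This is only an illustration: the function name make_table_lookup, the sample points and the choice to return None for an unseen x are assumptions, not part of the slides.

```python
def make_table_lookup(examples):
    """Build h(x) from (x, y) training pairs by memorising every pair."""
    table = {tuple(x): y for x, y in examples}

    def h(x):
        # Return the stored label if x was seen in training, otherwise None.
        return table.get(tuple(x))

    return h


# Hypothetical training data: 2-D points labelled "x" or "o".
h = make_table_lookup([((1.0, 2.0), "x"), ((3.0, 1.0), "o")])
print(h((1.0, 2.0)))  # an x that is in the table
print(h((1.1, 2.0)))  # an unseen x: the table has no answer
```

The second call shows the weakness of pure lookup: a query that differs even slightly from every stored example gets no useful prediction.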
K-nearest Neighbors
• We can improve on table lookup with a slight variation: given a
query xq, find the k examples that are nearest to xq. This set is
written NN(k, xq).
K-nearest Neighbors
• To do classification, first find NN(k, xq), then take the plurality
vote of the neighbors (which is the majority vote in the case of
binary classification)
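The two steps above, finding NN(k, xq) and taking the plurality vote, might look like the following minimal sketch. It assumes Euclidean distance, Python 3.8+ (for math.dist), and made-up training points; the function names are illustrative.

```python
import math
from collections import Counter


def nearest_neighbors(k, xq, examples):
    """NN(k, xq): the k training pairs (x, y) closest to xq (Euclidean)."""
    return sorted(examples, key=lambda ex: math.dist(ex[0], xq))[:k]


def knn_classify(k, xq, examples):
    """Return the plurality label among the k nearest neighbours."""
    votes = Counter(y for _, y in nearest_neighbors(k, xq, examples))
    # most_common breaks ties by first-seen order, i.e. in favour of
    # the label whose neighbour is closest to xq.
    return votes.most_common(1)[0][0]


# Hypothetical training set with two classes.
train = [((1.0, 1.0), "x"), ((1.5, 2.0), "x"), ((2.0, 1.0), "x"),
         ((5.0, 5.0), "o"), ((6.0, 5.5), "o")]
print(knn_classify(3, (1.2, 1.4), train))
print(knn_classify(3, (5.5, 5.0), train))
```

Sorting the whole training set costs O(n log n) per query, which is fine for a sketch; larger data sets usually call for a spatial index instead.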
Decision Boundary of k=1
• May overfit: the boundary bends around individual, possibly noisy,
training points
Decision Boundary of k=5
Distance Metric
• The very word “nearest” implies a distance metric
• Minkowski distance (L^p norm):
L^p(xj, xq) = (Σ_i |xj,i − xq,i|^p)^(1/p)
• p = 2: Euclidean distance
• p = 1: Manhattan distance
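Treating P as the exponent of the Minkowski (L^p) distance, both cases on this slide come from one function. The sketch below is an illustration under that reading; the function name and the sample vectors are assumptions.

```python
def minkowski_distance(a, b, p):
    """L^p distance between two equal-length numeric vectors."""
    return sum(abs(ai - bi) ** p for ai, bi in zip(a, b)) ** (1.0 / p)


a, b = (0.0, 0.0), (3.0, 4.0)
print(minkowski_distance(a, b, 2))  # p = 2: Euclidean distance
print(minkowski_distance(a, b, 1))  # p = 1: Manhattan distance
```

For these two points the Euclidean distance is the length of the straight line between them, while the Manhattan distance adds the per-dimension differences.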
Distance Metric
• Euclidean distance is used if the dimensions are measuring
similar properties, such as the width, height and depth of
parts on a conveyor belt
Distance Metric
• Hamming distance?
• Mahalanobis distance?
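Hamming distance, the first metric raised above, counts the attributes on which two points differ, which suits Boolean or categorical attributes. A minimal sketch, with an illustrative function name and made-up attribute vectors:

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(ai != bi for ai, bi in zip(a, b))


# Two hypothetical examples described by four Boolean attributes.
print(hamming_distance((1, 0, 1, 1), (1, 1, 0, 1)))
```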
Distance Metric
• If we use the raw numbers from each dimension then the total
distance will be affected by a change in scale in any dimension
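One common remedy for this scale sensitivity is to normalize each dimension before measuring distance, for example by subtracting the dimension's mean and dividing by its standard deviation. The sketch below assumes that approach and Python 3.8+ (for statistics.fmean); the function name and the sample data are illustrative.

```python
import statistics


def standardize(points):
    """Rescale every dimension to zero mean and unit standard deviation."""
    columns = list(zip(*points))
    means = [statistics.fmean(c) for c in columns]
    # A constant dimension has zero spread; divide by 1.0 to avoid an error.
    stdevs = [statistics.pstdev(c) or 1.0 for c in columns]
    return [tuple((v - m) / s for v, m, s in zip(p, means, stdevs))
            for p in points]


# Hypothetical data: the second dimension is on a much larger scale.
raw = [(1.0, 1000.0), (2.0, 3000.0), (3.0, 2000.0)]
scaled = standardize(raw)
print(scaled)
```

After rescaling, both dimensions contribute comparably to the distance, so changing the unit of one dimension no longer changes which neighbors are nearest.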
Nearest Neighbor Classifier
• Assign to each test data point the label of its nearest training
data point
[Figure: scatter plot of points marked x, o and + on axes x1 and x2]
1-nearest neighbor
[Figure: scatter plot of points marked x, o and + on axes x1 and x2]
3-nearest neighbor
[Figure: scatter plot of points marked x, o and + on axes x1 and x2]
5-nearest neighbor
[Figure: scatter plot of points marked x, o and + on axes x1 and x2]
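The 1-, 3- and 5-nearest-neighbor cases can give different labels for the same query. The sketch below shows this with made-up coordinates (not the points in the figures), assuming Euclidean distance and Python 3.8+; the function name knn_label is illustrative.

```python
import math
from collections import Counter


def knn_label(k, xq, examples):
    """Plurality label among the k training points closest to xq."""
    nearest = sorted(examples, key=lambda ex: math.dist(ex[0], xq))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical points: one "o" sits very close to the query, two "x"
# points come next, and two more "o" points lie further out.
points = [((0.1, 0.0), "o"), ((0.5, 0.0), "x"), ((0.0, 0.6), "x"),
          ((1.0, 0.0), "o"), ((0.0, 1.1), "o")]
query = (0.0, 0.0)
labels_by_k = {k: knn_label(k, query, points) for k in (1, 3, 5)}
print(labels_by_k)
```

With k = 1 the single closest point decides the label; with k = 3 the two "x" neighbors outvote it; with k = 5 the "o" class regains the plurality.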
Using K-NN
• https://www.cc.gatech.edu/~hays/compvision/lectures/17.pdf