K-NEAREST NEIGHBOR CLASSIFIER
Ajay Krishna Teja Kavuri
ajkavuri@mix.wvu.edu
OUTLINE
• BACKGROUND
• DEFINITION
• K-NN IN ACTION
• K-NN PROPERTIES
• REMARKS
BACKGROUND
“Classification is a data mining technique used to predict group
membership for data instances.”
• Group membership learned from known data is used to predict the
group of future data instances.
ORIGINS OF K-NN
• Nearest-neighbor methods have been used in statistical estimation and
pattern recognition since the beginning of the 1970s (as non-parametric
techniques).
• The method has spread to several disciplines and is still regarded as one
of the top 10 data mining algorithms.
MOST CITED PAPERS
K-NN has several variations that came out of optimizations
through research. The following are the most cited publications:
• Approximate nearest neighbors: towards removing the curse of dimensionality
Piotr Indyk, Rajeev Motwani
• Nearest neighbor queries
Nick Roussopoulos, Stephen Kelley, Frédéric Vincent
• Machine learning in automated text categorization
Fabrizio Sebastiani
IN A SENTENCE K-NN IS…..
• It’s how people judge: by observing their peers.
• We tend to move with people of
similar attributes, and so does data.
DEFINITION
• K-Nearest Neighbor is considered a lazy learning algorithm
that classifies a data item based on its similarity to its
neighbors.
• “K” stands for the number of neighboring data items
that are considered for the classification.
Ex: Image shows classification for different k-values.
TECHNICALLY…..
• Given the attributes A = {X1, X2, …, XD}, where D is the
dimension of the data, we need to predict the corresponding
classification group from G = {Y1, Y2, …, Yn} using a proximity
metric over K items in D dimensions that defines the closeness
of association, such that X ∈ R^D and Yp ∈ G.
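A minimal Python sketch of the definition above, assuming Euclidean distance as the proximity metric and a simple majority vote among the K nearest items. The function name knn_predict and the sample points are hypothetical and serve only as an illustration.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, query, k):
    """Return the majority label among the k stored points closest to query."""
    # Distance from the query to every stored point (Euclidean, an assumption).
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    # Keep the k items with the smallest distance.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the labels of those k items.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical data with D = 2 and G = {"Y1", "Y2"}.
train_X = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5)]
train_y = ["Y1", "Y1", "Y2", "Y2"]
prediction = knn_predict(train_X, train_y, (1.2, 1.4), k=3)
```

With K=3 the query's neighbors here are two "Y1" points and one "Y2" point, so the vote goes to "Y1".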
THAT IS….
• Attribute A={Color, Outline, Dot}
• Classification Group,
G={triangle, square}
• D=3; we are free to choose the value of K.
[Figure: table of samples with columns for the attributes A and the classification group G]
PROXIMITY METRIC
• Definition: Also termed a “similarity measure”, it quantifies the
association among different items.
• The following table lists measures for different data formats:
Similarity Measure                                          Data Format
Contingency Table, Jaccard Coefficient, Distance Measures   Binary
Z-Score, Min-Max Normalization, Distance Measures           Numeric
Cosine Similarity, Dot Product                              Vectors
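As a rough illustration of three entries from the table, the sketch below computes a Jaccard coefficient for binary data, min-max normalization for numeric data, and cosine similarity for vectors. The function names are hypothetical, and returning 1.0 for two all-zero binary vectors is a convention assumed here, not something the slides specify.

```python
import math


def jaccard(a, b):
    """Jaccard coefficient of two equal-length binary (0/1) vectors."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Assumed convention: two all-zero vectors count as identical.
    return both / either if either else 1.0


def min_max(values):
    """Rescale numeric values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: dot product over norms."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


j = jaccard([1, 1, 0, 0], [1, 0, 1, 0])      # 1 shared "1" out of 3 positions with any "1"
scaled = min_max([10, 20, 30])               # [0.0, 0.5, 1.0]
c = cosine_similarity([1.0, 0.0], [0.0, 1.0])  # orthogonal vectors give 0.0
```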
PROXIMITY METRIC
• For numeric data, let us consider some distance measures:
– Manhattan Distance: dist(X,Y) = |x1-y1| + |x2-y2| + … + |xD-yD|
– Ex: Given X = {1,2} & Y = {2,5}
Manhattan Distance = dist(X,Y) = |1-2|+|2-5|
= 1+3
= 4
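The Manhattan example can be checked with a few lines of Python. The function name manhattan is just an illustrative choice.

```python
def manhattan(x, y):
    """Sum of absolute coordinate differences between x and y."""
    return sum(abs(a - b) for a, b in zip(x, y))


d = manhattan((1, 2), (2, 5))  # |1-2| + |2-5| = 1 + 3 = 4
```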
PROXIMITY METRIC
– Euclidean Distance: dist(X,Y) = [ (x1-y1)^2 + … + (xD-yD)^2 ]^(1/2)
– Ex: Given X = {-2,2} & Y = {2,5}
Euclidean Distance = dist(X,Y) = [ (-2-2)^2 + (2-5)^2 ]^(1/2)
= (16 + 9)^(1/2)
= 5
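The Euclidean example works the same way in code. The function name euclidean is illustrative; Python 3.8+ also offers math.dist for the same computation.

```python
import math


def euclidean(x, y):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


d = euclidean((-2, 2), (2, 5))  # (16 + 9)^(1/2) = 5.0
```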
K-NN IN ACTION
• Consider the following data:
A={weight,color}
G={Apple(A), Banana(B)}
• We need to predict the type of a
fruit with:
weight = 378
color = red
SOME PROCESSING….
• Assign color codes to convert into numerical data:
• Let’s label Apple as “A” and
Banana as “B”
PLOTTING
• Using K=3, our result will be:
[Figure: plot of the labeled data and the query fruit for K=3]
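The fruit example could be sketched as follows. The training samples and the color codes below are entirely made up for illustration, since the slide's data table is not reproduced in this text, so the outcome applies to this toy data only. Euclidean distance is assumed.

```python
from collections import Counter
import math

# Hypothetical color codes; the slide's actual coding is not shown in the text.
COLOR_CODE = {"red": 1, "green": 2, "yellow": 3}

# Hypothetical training samples: (weight, color, label), A = Apple, B = Banana.
samples = [
    (303, "yellow", "B"),
    (370, "red", "A"),
    (298, "yellow", "B"),
    (277, "yellow", "B"),
    (377, "red", "A"),
    (299, "red", "A"),
]


def classify_fruit(weight, color, k):
    """Majority label among the k samples closest to (weight, color code)."""
    query = (weight, COLOR_CODE[color])
    scored = sorted(
        (math.dist((w, COLOR_CODE[c]), query), label) for w, c, label in samples
    )
    top_k = [label for _, label in scored[:k]]
    return Counter(top_k).most_common(1)[0][0]


fruit = classify_fruit(378, "red", k=3)
```

In this toy data the weight differences are far larger than the color-code differences, so weight dominates the distance. That is one reason the normalization measures listed under PROXIMITY METRIC are often applied first.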
AS K VARIES….
• Clearly, K has an impact on the classification.
Can you guess?
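A small sketch of how the predicted class can flip as K changes. The one-dimensional points and labels are hypothetical, chosen so that the vote changes between K=1, K=3, and K=5.

```python
from collections import Counter


def knn_predict_1d(points, query, k):
    """Majority label among the k points closest to query on a line."""
    nearest = sorted(points, key=lambda p: abs(p[0] - query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical labeled points: (position, label).
points = [(1.0, "A"), (2.0, "B"), (3.0, "B"), (4.0, "A"), (5.0, "A")]

# Classify the same query with three different K values.
results = {k: knn_predict_1d(points, 0.0, k) for k in (1, 3, 5)}
```

Here K=1 sees only the closest "A", K=3 sees one "A" and two "B", and K=5 sees three "A" and two "B", so the answer goes from "A" to "B" and back to "A".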
K-NN LIVE!!
• http://www.ai.mit.edu/courses/6.034b/KNN.html
K-NN VARIATIONS
• Weighted K-NN: Assigns a weight to each
attribute, which sets the priority among attributes.
Ex: [Figure: example data with the weight and probability formulas and the resulting dataset]
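One common way to give attributes different priority is to scale each squared difference by a per-attribute weight inside the distance. The sketch below assumes that approach; it is not the slide's own weight or probability formula, which is not reproduced in this text. All names and data are hypothetical.

```python
from collections import Counter
import math


def weighted_distance(x, y, weights):
    """Euclidean distance with each attribute's squared difference scaled by its weight."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def weighted_knn(train, query, k, weights):
    """Majority label among the k training items closest under the weighted distance."""
    nearest = sorted(
        train, key=lambda item: weighted_distance(item[0], query, weights)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical data: (attributes, label).
train = [((0.0, 10.0), "A"), ((3.0, 0.0), "B")]
query = (0.0, 0.0)

equal = weighted_knn(train, query, k=1, weights=(1.0, 1.0))
downweighted = weighted_knn(train, query, k=1, weights=(1.0, 0.01))
```

With equal weights the nearest item is "B". Once the second attribute is down-weighted, "A" becomes the nearest item, which shows how the weights shift priority between attributes.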
K-NN VARIATIONS
• (K-l)-NN: Reduces complexity by placing a threshold on the
majority. We can restrict the associations through (K-l)-NN.
Ex: Classify only if the majority count is over a given
threshold l; otherwise reject.
Here, K=5 and l=4. As no class has a
majority with count>4, we decline
to classify the element.
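The rejection rule can be sketched as a vote over the labels of the K nearest neighbors, following the slide's "count > l" condition. The function name and the neighbor labels are hypothetical; finding the K neighbors themselves is assumed to have happened already.

```python
from collections import Counter


def vote_with_reject(neighbor_labels, l):
    """Return the majority label if its count exceeds l, otherwise None (reject)."""
    label, count = Counter(neighbor_labels).most_common(1)[0]
    return label if count > l else None


# Hypothetical labels of the K=5 nearest neighbors.
mixed = ["triangle", "triangle", "triangle", "square", "square"]
unanimous = ["triangle"] * 5

rejected = vote_with_reject(mixed, l=4)      # top count is 3, not > 4
accepted = vote_with_reject(unanimous, l=4)  # top count is 5, which is > 4
```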
K-NN PROPERTIES
• K-NN is a lazy algorithm
• The processing differs with the value of K.
• Result is generated after analysis of stored data.
• It neglects any intermediate values.
REMARKS: FIRST THE GOOD
Advantages
• Can be applied to data from any distribution;
for example, the data does not have to be separable with a linear
boundary
• Very simple and intuitive
• Good classification if the number of samples is large enough
NOW THE BAD….
Disadvantages
• Dependent on K Value
• Test stage is computationally expensive
• No training stage; all the work is done during the test stage
• This is actually the opposite of what we want. Usually we can
afford a training step that takes a long time, but we want a fast test step
• Needs a large number of samples for accuracy
THANK YOU
