
Week 15 - Clustering


Clustering

Nallig Leal Narváez


PhD in Systems and Computer Engineering

Universidad Autónoma del Caribe


CONTENT

Fundamentals

• Learning
• Classification and Clustering

Clustering
• K-Means Clustering

Fundamentals – Learning

• Supervised
• Unsupervised

Classification and Clustering

Classification and clustering are two methods of pattern identification
used in machine learning. Although the two techniques are similar, they
differ in one key respect: classification assigns objects to predefined
classes, whereas clustering groups objects by the characteristics they
have in common, which also distinguish them from other groups of objects.
These groups are known as "clusters". [1]

[1] https://blog.bismart.com/en/classification-vs.-clustering-a-practical-explanation

K-Means Clustering – What is it?

The K-Means algorithm is an unsupervised machine learning clustering
algorithm that groups objects into K groups based on their characteristics.

Graphics were taken from StatQuest of Josh Starmer


K-Means Clustering – How does it work?

Suppose you have data that you need to group into three clusters.

In this case, the data form three relatively obvious clusters.

Step 1

Select the number K of clusters you want to identify in your data. For example, K = 3.

Step 2

Randomly select 3 distinct data points. These are the initial clusters, each represented by a single point.
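
Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration: the data values, the helper name `pick_initial_centers`, and the `seed` parameter are assumptions made for the example and do not come from the slides.

```python
import random

# Hypothetical 1-D measurements; the values are made up for illustration.
data = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0, 15.0, 15.5, 16.0]

def pick_initial_centers(points, k, seed=None):
    """Step 2: choose k distinct data points as the initial clusters."""
    rng = random.Random(seed)
    # sample() draws without replacement, so no position is picked twice.
    return rng.sample(points, k)

k = 3  # Step 1: the number of clusters to look for
centers = pick_initial_centers(data, k, seed=0)
print(centers)
```

Fixing the seed only makes the run repeatable; a different seed will usually pick different starting points.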

Step 3

Measure the distance between the first point and each of the three initial clusters.

Step 4

Assign the first point to the nearest cluster. In this case, the nearest cluster is the blue
one.

Then, do the same for the other points.
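
As a minimal sketch of Steps 3 and 4, assuming one-dimensional data and absolute difference as the distance, each point can be labeled with the index of its nearest cluster center. The function names, data values and hand-picked centers below are hypothetical.

```python
def nearest_center(point, centers):
    """Return the index of the center closest to `point` (1-D distance)."""
    return min(range(len(centers)), key=lambda i: abs(point - centers[i]))

def assign_points(points, centers):
    """Steps 3 and 4: label every point with the index of its nearest center."""
    return [nearest_center(p, centers) for p in points]

data = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0, 15.0, 15.5, 16.0]
centers = [1.5, 8.0, 16.0]  # assumed initial centers, picked by hand here
labels = assign_points(data, centers)
print(labels)  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```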

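Step 5

Calculate the mean of each cluster. Then repeat the process using the mean values as the
new cluster centers: measure the distances, reassign the points, and recalculate the means.
Since the clustering does not change in this iteration, the process ends.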


K-Means Clustering – How many clusters?
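
One common heuristic for choosing K, offered here as an assumption rather than as the method shown on the slides, is the elbow method: compute the total variation within the clusters for several values of K and look for the point where adding another cluster stops helping much. The sketch below uses hand-made groupings of hypothetical data instead of running K-Means, and the function name `within_cluster_variation` is illustrative.

```python
from statistics import mean

def within_cluster_variation(clusters):
    """Sum of squared distances from each point to its cluster mean."""
    total = 0.0
    for members in clusters:
        center = mean(members)
        total += sum((p - center) ** 2 for p in members)
    return total

# Hand-made groupings of the same hypothetical data for K = 1, 2, 3, 4.
groupings = {
    1: [[1.0, 1.5, 2.0, 8.0, 8.5, 9.0, 15.0, 15.5, 16.0]],
    2: [[1.0, 1.5, 2.0, 8.0, 8.5, 9.0], [15.0, 15.5, 16.0]],
    3: [[1.0, 1.5, 2.0], [8.0, 8.5, 9.0], [15.0, 15.5, 16.0]],
    4: [[1.0, 1.5], [2.0], [8.0, 8.5, 9.0], [15.0, 15.5, 16.0]],
}
variation = {k: within_cluster_variation(c) for k, c in groupings.items()}
for k, value in variation.items():
    print(k, value)
```

For these groupings the variation falls steeply up to K = 3 and only slightly from K = 3 to K = 4, which suggests K = 3 as the "elbow".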


K-Means Clustering in two dimensions
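
In two dimensions the same steps apply. A minimal sketch, assuming Euclidean distance between points and a coordinate-wise mean for each cluster, could look like this; the function names, points and centers are hypothetical.

```python
import math

def nearest_center_2d(point, centers):
    """Index of the center with the smallest Euclidean distance to `point`."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def mean_point(points):
    """Coordinate-wise mean of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]  # hypothetical data
centers = [(1.0, 1.0), (9.0, 9.5)]                          # assumed centers
labels = [nearest_center_2d(p, centers) for p in points]
print(labels)  # [0, 0, 1, 1]

# New center of cluster 0: the mean of the points assigned to it.
print(mean_point([p for p, lab in zip(points, labels) if lab == 0]))  # (1.25, 1.5)
```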
