Partitioning Around Medoid: K-Medoids
The K-Medoids algorithm (Partitioning Around Medoids, PAM) was proposed in 1987 by Kaufman and Rousseeuw. A medoid is the point in a cluster whose total dissimilarity to all the other points in the cluster is minimum.
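The definition above can be illustrated directly in code. A minimal sketch, assuming hypothetical toy points and Manhattan distance (the dissimilarity measure used in the worked example below):

```python
def manhattan(p, q):
    # Sum of absolute coordinate differences between two 2-D points.
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def find_medoid(cluster):
    # The medoid is the member point whose summed dissimilarity
    # to all other members of the cluster is smallest.
    return min(cluster, key=lambda c: sum(manhattan(c, p) for p in cluster))

cluster = [(1, 1), (2, 2), (3, 3), (4, 4), (9, 9)]  # assumed example points
print(find_medoid(cluster))  # -> (3, 3)
```

Note that, unlike a mean, the medoid is always an actual data point, so the outlier (9, 9) cannot drag the cluster centre off the data.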
The dissimilarity between a medoid Ci and an object Pi is calculated as E = |Pi − Ci|, the sum of absolute coordinate differences (Manhattan distance).
The total cost in the K-Medoids algorithm is given as

C = Σ (over each cluster Ci) Σ (over each point Pi in Ci) |Pi − Ci|

Steps 1 and 2: Randomly select k medoids; here k = 2, with initial medoids C1 = (4, 5) and C2 = (8, 5). With these medoids, the total cost of the assignment is 20.
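The cost computation can be sketched as follows (the helper names and toy points are illustrative assumptions, not the dataset from the original example):

```python
def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def total_cost(points, medoids):
    # Assign each non-medoid point to its nearest medoid and sum the
    # dissimilarities: C = sum over clusters Ci of |Pi - Ci|.
    return sum(min(manhattan(p, m) for m in medoids)
               for p in points if p not in medoids)

# Hypothetical data, with medoids C1 = (4, 5) and C2 = (8, 5) as in the example.
points = [(4, 5), (8, 5), (5, 6), (7, 4), (9, 7)]
print(total_cost(points, [(4, 5), (8, 5)]))  # 2 + 2 + 3 = 7
```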
Each point is assigned to the cluster of the medoid to which its dissimilarity is smaller.
Step 3: Randomly select one non-medoid point, swap it with one of the medoids, and recalculate the cost.
Let the randomly selected point be (8, 4), which replaces C2 as the candidate medoid. The dissimilarity of each non-medoid point to the medoids C1 = (4, 5) and C2 = (8, 4) is calculated and tabulated.
Each point is assigned to the cluster whose medoid it is less dissimilar to. So points 1, 2, and 5 go to cluster C1, and points 0, 3, 6, 7, and 8 go to cluster C2.
New cost = (3 + 4 + 4) + (2 + 2 + 1 + 3 + 3) = 22
Swap cost = New cost − Previous cost = 22 − 20 = 2 > 0
As the swap cost is not less than zero, we undo the swap, so C1 = (4, 5) and C2 = (8, 5) remain the final medoids.
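Putting the steps together, the swap phase of PAM can be sketched as follows: repeatedly try replacing a medoid with a non-medoid point, accept a swap only when its swap cost is negative, and stop (undoing the last trial) when no swap improves the cost. The function names and toy data here are illustrative assumptions:

```python
def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def total_cost(points, medoids):
    # Sum each non-medoid point's dissimilarity to its nearest medoid.
    return sum(min(manhattan(p, m) for m in medoids)
               for p in points if p not in medoids)

def pam_swap(points, medoids):
    # Step 3 repeated: try every (medoid, non-medoid) swap; keep the best
    # candidate if its swap cost is negative, otherwise stop.
    medoids = list(medoids)
    while True:
        best_cost, best = total_cost(points, medoids), medoids
        for i in range(len(medoids)):
            for p in points:
                if p in medoids:
                    continue
                candidate = medoids[:i] + [p] + medoids[i + 1:]
                cost = total_cost(points, candidate)
                if cost < best_cost:  # swap cost = cost - best_cost < 0
                    best_cost, best = cost, candidate
        if best == medoids:  # no improving swap: these medoids are final
            return medoids
        medoids = best

# Two well-separated groups (hypothetical data); PAM settles on one medoid
# per group.
points = [(1, 1), (1, 2), (2, 1), (10, 10), (10, 11), (11, 10)]
print(sorted(pam_swap(points, [(1, 1), (1, 2)])))  # [(1, 1), (10, 10)]
```

Because the search only ever accepts cost-reducing swaps over a finite set of candidates, it is guaranteed to terminate, though (as noted below) only at a locally optimal choice of medoids.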
Advantages:
1. It is simple to understand and easy to implement.
2. The K-Medoids algorithm is fast and converges in a fixed number of steps.
3. PAM is less sensitive to outliers than other partitioning algorithms such as K-Means, because medoids are actual data points rather than computed means.
Disadvantages:
1. The main disadvantage of the K-Medoids algorithm is that it is not suitable for clustering non-spherical (arbitrarily shaped) groups of objects. This is because it relies on minimizing the distances between the non-medoid objects and the medoid (the cluster centre); briefly, it uses compactness as the clustering criterion instead of connectivity.
2. It may produce different results on different runs over the same dataset, because the first k medoids are chosen randomly.