Lectures 5 and 6 - Data Analysis in Management - MBM
Cluster analysis
Outline of the lecture
Example of clustering
Learning objectives
Upon completing this lecture, you should be able to:
The greater the distance, the less similar the objects are.
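To make the link between distance and similarity concrete, here is a minimal sketch that assumes Euclidean distance (one common choice; the slides do not fix a particular measure). The three customers and their attributes are hypothetical.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical customers described by (age, monthly spend in hundreds).
anna = (25, 4)
ben = (28, 8)
carl = (60, 4)

d_anna_ben = euclidean(anna, ben)    # 5.0
d_anna_carl = euclidean(anna, carl)  # 35.0

# The smaller distance marks the more similar pair: Anna is closer to Ben.
more_similar = "ben" if d_anna_ben < d_anna_carl else "carl"
```

In practice the variables are usually standardized first, so that a variable measured on a large scale (here, age) does not dominate the distance.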
4. Repeat steps 2 and 3 until all clusters have been merged into
a single cluster containing all observations.
Common steps in cluster analysis
Select a clustering algorithm (procedure).
The primary difference among hierarchical clustering
algorithms is how they define the distance between clusters (step 2).
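The merging procedure and the role of the cluster-distance definition can be sketched as follows. This is an illustration, not the lecture's own implementation. It assumes the usual agglomerative scheme (start with every observation as its own cluster, compute the distances between clusters, merge the closest pair, repeat) and Euclidean distance between objects. The three distance definitions shown (single, complete and average linkage) are common choices that the slides do not name explicitly.

```python
import math
from itertools import combinations

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Three common definitions of the distance between two clusters.
def single_linkage(c1, c2):
    return min(euclidean(p, q) for p in c1 for q in c2)    # closest pair

def complete_linkage(c1, c2):
    return max(euclidean(p, q) for p in c1 for q in c2)    # farthest pair

def average_linkage(c1, c2):
    return sum(euclidean(p, q) for p in c1 for q in c2) / (len(c1) * len(c2))

def agglomerate(points, linkage):
    """Merge the two closest clusters until one remains.

    Returns the merge history as a list of (distance, merged_cluster).
    """
    clusters = [[p] for p in points]  # every observation starts alone
    history = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest cluster distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        dist = linkage(clusters[i], clusters[j])
        merged = clusters[i] + clusters[j]
        history.append((dist, merged))
        # Replace the two merged clusters by their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return history

# Hypothetical one-dimensional observations.
points = [(0,), (1,), (3,), (7,)]
single_heights = [d for d, _ in agglomerate(points, single_linkage)]      # [1.0, 2.0, 4.0]
complete_heights = [d for d, _ in agglomerate(points, complete_linkage)]  # [1.0, 3.0, 7.0]
```

The same observations are merged in the same order here, but at different distances, which is what changes the shape of the resulting dendrogram.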
a(i) for the i-th object is the average distance between the i-th
object and all other objects in the same cluster.
b(i) for the i-th object is the smallest average distance between
the i-th object and all objects of any cluster that does not contain
it, i.e. the average distance to the nearest neighbouring cluster.
The silhouette of the i-th object is then
s(i) = (b(i) − a(i)) / max{a(i), b(i)}.
Common steps in cluster analysis
Choice of the number of clusters k in the k-means method.
The Silhouette coefficient
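The silhouette coefficient ranges from −1 to +1; a high value
indicates that the object is well matched to its own cluster and
poorly matched to neighbouring clusters.
Thus the average of s(i) over all objects in the dataset is a
measure of how appropriately the data have been clustered.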
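The average silhouette can be used to compare candidate values of k. The sketch below computes a(i), b(i) and s(i) = (b(i) − a(i)) / max{a(i), b(i)} directly from their definitions, assuming Euclidean distance and the convention that an object alone in its cluster gets s(i) = 0. The data and the two label assignments are hypothetical and hand-made; no k-means run is performed here.

```python
import math

def euclidean(p, q):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def mean_distance(i, members, points):
    """Average distance from object i to the objects indexed by members."""
    return sum(euclidean(points[i], points[j]) for j in members) / len(members)

def silhouette(i, points, labels):
    """s(i) for the i-th object; assumes at least two clusters exist."""
    own = [j for j, lab in enumerate(labels) if lab == labels[i] and j != i]
    if not own:
        return 0.0  # assumed convention for a singleton cluster
    a = mean_distance(i, own, points)
    # b(i): smallest average distance to any cluster not containing i.
    b = min(
        mean_distance(i, [j for j, lab in enumerate(labels) if lab == other], points)
        for other in set(labels)
        if other != labels[i]
    )
    if max(a, b) == 0:
        return 0.0  # all relevant objects coincide; avoid dividing by zero
    return (b - a) / max(a, b)

def average_silhouette(points, labels):
    return sum(silhouette(i, points, labels) for i in range(len(points))) / len(points)

# Hypothetical one-dimensional data with two visible groups.
points = [(0,), (2,), (10,), (12,)]
labels_k2 = [0, 0, 1, 1]  # candidate clustering with k = 2
labels_k3 = [0, 0, 1, 2]  # candidate clustering with k = 3

score_k2 = average_silhouette(points, labels_k2)  # about 0.798
score_k3 = average_silhouette(points, labels_k3)  # 0.3875
```

Under these assumptions the k = 2 assignment has the higher average silhouette, so it would be preferred over k = 3 for this toy dataset.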