Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clustering Example |Simplilearn
This presentation about hierarchical clustering will help you understand what clustering is, what hierarchical clustering is, how hierarchical clustering works, what a distance measure is, what agglomerative clustering is, and what divisive clustering is. You will also see a demo on how to group states based on their sales using a clustering method. Clustering is the method of dividing objects into clusters such that objects within a cluster are similar to one another and dissimilar to the objects belonging to other clusters. It is used to find data clusters such that each cluster has the most closely matched data. Prototype-based clustering, hierarchical clustering, and density-based clustering are three types of clustering algorithms. Let us discuss hierarchical clustering in this video. In simple terms, hierarchical clustering is separating data into different groups based on some measure of similarity.
Below topics are explained in this "Hierarchical Clustering" presentation:
1. What is clustering?
2. What is hierarchical clustering
3. How does hierarchical clustering work?
4. Distance measure
5. What is agglomerative clustering
6. What is divisive clustering
7. Demo: to group states based on their sales
Why learn Machine Learning?
Machine Learning is taking over the world, and with that, there is a growing need among companies for professionals who know the ins and outs of Machine Learning.
The Machine Learning market size is expected to grow from USD 1.03 Billion in 2016 to USD 8.81 Billion by 2022, at a Compound Annual Growth Rate (CAGR) of 44.1% during the forecast period.
What skills will you learn from this Machine Learning course?
By the end of this Machine Learning course, you will be able to:
1. Master the concepts of supervised, unsupervised, and reinforcement learning and modeling.
2. Gain practical mastery over principles, algorithms, and applications of Machine Learning through a hands-on approach which includes working on 28 projects and one capstone project.
3. Acquire thorough knowledge of the mathematical and heuristic aspects of Machine Learning.
4. Understand the concepts and operation of support vector machines, kernel SVM, naive Bayes, decision tree classifier, random forest classifier, logistic regression, K-nearest neighbors, K-means clustering and more.
5. Be able to model a wide variety of robust Machine Learning algorithms including deep learning, clustering, and recommendation systems
We recommend this Machine Learning training course for the following professionals in particular:
1. Developers aspiring to be a data scientist or Machine Learning engineer
2. Information architects who want to gain expertise in Machine Learning algorithms
3. Analytics professionals who want to work in Machine Learning or artificial intelligence
4. Graduates looking to build a career in data science and Machine Learning
Learn more at www.simplilearn.com
2. What’s in it for you?
What is Clustering?
What is Hierarchical Clustering?
How does Hierarchical Clustering work?
Distance Measure
What is Agglomerative Clustering?
What is Divisive Clustering?
10. What is Clustering?
It will group places with least distance
Clustering is the method of dividing objects into clusters such that objects within a cluster are similar to one another and dissimilar to the objects belonging to other clusters
13. What is Clustering?
Types of clustering:
• Hierarchical clustering: Agglomerative, Divisive
• Partitional clustering: K-means, Fuzzy C-Means
18. What is Hierarchical Clustering?
Let's consider that we have a set of cars and we have to group similar ones together
19. What is Hierarchical Clustering?
Hierarchical clustering creates a tree-like structure and groups similar objects together
20. What is Hierarchical Clustering?
The grouping continues until we reach the last cluster
21. What is Hierarchical Clustering?
Hierarchical clustering separates data into different groups based on some measure of similarity
22. Types of Hierarchical Clustering
• Agglomerative: known as the bottom-up approach
23. Types of Hierarchical Clustering
• Divisive: known as the top-down approach
25. What is Hierarchical Clustering?
[Figure: scatter plot of six points, P1 to P6, on a plane; vertical axis labeled "Y-Values". Side labels on the slide: Measure the distance, Grouping, Termination, Convergence]
• Let's consider we have a few points on a plane
26. What is Hierarchical Clustering?
• Each data point is a cluster of its own
27. What is Hierarchical Clustering?
• We try to find the least distance between two data points/clusters
28. What is Hierarchical Clustering?
[Figure: the scatter plot of P1 to P6 before and after a merge]
• The two nearest clusters/data points are merged together
29. What is Hierarchical Clustering?
• This is represented in a tree-like structure called a dendrogram
[Figure: dendrogram with leaves P2 and P1]
30. What is Hierarchical Clustering?
[Figure: dendrogram with leaves P2, P1, P3, P4]
31. What is Hierarchical Clustering?
[Figure: dendrogram with leaves P5, P6, P3, P4, P2, P1]
33. What is Hierarchical Clustering?
[Figure: the scatter plot and the completed dendrogram covering P1 to P6]
• We terminate when we are left with only one cluster
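The walkthrough above (every point starts as its own cluster, the two nearest clusters are merged, and the process ends with one cluster) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coordinates are hypothetical because the slides do not give values for P1 to P6, and the distance between two clusters is taken to be the distance between their closest members (single linkage), which the slides do not specify.

```python
import math
from itertools import combinations

# Hypothetical coordinates; the slides do not give values for P1..P6.
points = {
    "P1": (1.0, 1.0), "P2": (1.3, 1.1),
    "P3": (4.0, 4.0), "P4": (4.4, 4.3),
    "P5": (7.0, 2.0), "P6": (7.6, 2.4),
}

def cluster_distance(a, b):
    # Assumed linkage: distance between the closest pair of members.
    return min(math.dist(points[p], points[q]) for p in a for q in b)

def agglomerate(names):
    clusters = [frozenset([n]) for n in names]  # each point is its own cluster
    merges = []                                 # merge history, i.e. the dendrogram
    while len(clusters) > 1:                    # terminate when one cluster is left
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        merges.append((sorted(a), sorted(b), round(cluster_distance(a, b), 3)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges

for left, right, height in agglomerate(points):
    print(left, "+", right, "merged at distance", height)
```

Each printed line corresponds to one join in the dendrogram; the distance at which the join happens is the height of that join.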
34. What is Hierarchical Clustering?
An algorithm that builds a hierarchy of clusters
[Figure: scatter plot of P1 to P6 with a dendrogram whose leaves are ordered P5, P6, P2, P1, P3, P4]
How do we measure the distance between the data points?
41. Euclidean Distance Measure
Distance measures covered:
01 Euclidean distance measure
02 Squared Euclidean distance measure
03 Manhattan distance measure
04 Cosine distance measure
• The Euclidean distance is the "ordinary" straight-line distance
• It is the distance between two points p and q in Euclidean space
$$d = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}$$
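As a small illustration of the Euclidean measure, the sketch below computes it for two points given as coordinate tuples. The function name `euclidean` is chosen here for illustration and does not come from the slides.

```python
import math

def euclidean(p, q):
    # Square root of the summed squared coordinate differences.
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))

# A 3-4-5 right triangle: the straight-line distance is the hypotenuse.
print(euclidean((0, 0), (3, 4)))
```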
42. Squared Euclidean Distance Measure
The squared Euclidean distance metric uses the same equation as the Euclidean distance metric, but does not take the square root.
$$d = \sum_{i=1}^{n} (q_i - p_i)^2$$
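A minimal sketch of the squared Euclidean measure follows. It also shows one reason the measure can be convenient: skipping the square root does not change which candidate is nearest. The names `squared_euclidean`, `origin`, and `candidates` are illustrative only.

```python
def squared_euclidean(p, q):
    # Same sum as the Euclidean distance, without the square root.
    return sum((qi - pi) ** 2 for pi, qi in zip(p, q))

origin = (0, 0)
candidates = [(3, 4), (1, 1), (2, 0)]

# The nearest candidate is the same one the Euclidean distance would pick.
nearest = min(candidates, key=lambda c: squared_euclidean(origin, c))
print(nearest, squared_euclidean(origin, nearest))
```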
43. Manhattan Distance Measure
The Manhattan distance is the simple sum of the horizontal and vertical components, that is, the distance between two points measured along axes at right angles
$$d = \sum_{i=1}^{n} |q_i - p_i|$$
For two points $p = (p_x, p_y)$ and $q = (q_x, q_y)$ in the plane this is $d = |q_x - p_x| + |q_y - p_y|$
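The Manhattan measure can be sketched the same way: add up the absolute difference along each axis. The function name `manhattan` is an illustrative choice.

```python
def manhattan(p, q):
    # Sum of the absolute coordinate differences along each axis.
    return sum(abs(qi - pi) for pi, qi in zip(p, q))

# 3 units horizontally plus 4 units vertically.
print(manhattan((1, 2), (4, 6)))
```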
44. Cosine Distance Measure
The cosine measure is based on the angle $\theta$ between the two vectors p and q
$$\cos\theta = \frac{\sum_{i=1}^{n} p_i q_i}{\sqrt{\sum_{i=1}^{n} q_i^2} \times \sqrt{\sum_{i=1}^{n} p_i^2}}$$
This value is the cosine similarity; the cosine distance is commonly taken as $d = 1 - \cos\theta$
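A short sketch of the cosine measure is given below. It assumes the common convention that the cosine distance is one minus the cosine similarity, where the similarity is the dot product of the two vectors divided by the product of their lengths. It also assumes neither vector is all zeros. The function names are illustrative.

```python
import math

def cosine_similarity(p, q):
    # Dot product divided by the product of the vector lengths.
    dot = sum(pi * qi for pi, qi in zip(p, q))
    norm_p = math.sqrt(sum(pi * pi for pi in p))
    norm_q = math.sqrt(sum(qi * qi for qi in q))
    return dot / (norm_p * norm_q)

def cosine_distance(p, q):
    # Assumed convention: distance = 1 - similarity.
    return 1.0 - cosine_similarity(p, q)

print(cosine_distance((1, 0), (0, 1)))  # perpendicular vectors
print(cosine_distance((1, 2), (2, 4)))  # same direction, different length
```

Because only the angle matters, two vectors that point the same way have a distance near zero even if their lengths differ.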
46. What is Agglomerative Clustering?
Agglomerative clustering begins with each element as a separate cluster and merges them into larger clusters
49. What is Agglomerative Clustering?
There are three key questions that need to be answered:
• How do we represent a cluster of more than one point?
• How do we determine the nearness of clusters?
• When do we stop combining clusters?
50. What is Agglomerative Clustering?
Let's assume that we have 6 points in a Euclidean space: (1,2), (2,1), (0,0), (4,1), (5,3), (5,0)
51. What is Agglomerative Clustering?
How do we represent a cluster of more than one point?
52. What is Agglomerative Clustering?
We make use of centroids; a centroid is the average of a cluster's points
54. What is Agglomerative Clustering?
[Figure: the six points with a centroid marked at (1.5,1.5), the average of (1,2) and (2,1)]
56. What is Agglomerative Clustering?
[Figure: a second centroid marked at (4.5,0.5), the average of (4,1) and (5,0)]
57. What is Agglomerative Clustering?
[Figure: a centroid marked at (1,1), the average of (1,2), (2,1), and (0,0)]
58. What is Agglomerative Clustering?
[Figure: a centroid marked at (4.7,1.3), the rounded average of (4,1), (5,3), and (5,0)]
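The centroid values marked on these slides can be reproduced by averaging the coordinates of the points in each cluster. The sketch below does that; which points belong to which centroid is inferred here from the numbers, since the slides show only the resulting coordinates.

```python
def centroid(cluster):
    # Average each coordinate over the cluster's points.
    n = len(cluster)
    return tuple(sum(coord) / n for coord in zip(*cluster))

print(centroid([(1, 2), (2, 1)]))          # (1.5, 1.5)
print(centroid([(4, 1), (5, 0)]))          # (4.5, 0.5)
print(centroid([(1, 2), (2, 1), (0, 0)]))  # (1.0, 1.0)

cx, cy = centroid([(4, 1), (5, 0), (5, 3)])
print(round(cx, 1), round(cy, 1))          # 4.7 1.3
```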
60. What is Agglomerative Clustering?
When to stop combining clusters?
61. What is Agglomerative Clustering?
There are many approaches to deciding when to stop
62. What is Agglomerative Clustering?
Approach 1: Pick a number of clusters (k) upfront
We decide the number of clusters required at the beginning and terminate when we reach that value (k)
63. What is Agglomerative Clustering?
Possible challenges
This only makes sense when we already know something about the data
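Approach 1 can be sketched as a loop that keeps merging the two clusters with the nearest centroids until only k clusters remain. This is a minimal illustration, assuming centroid distance as the nearness measure and using the six example points from the earlier slides; the function name `agglomerate_to_k` is hypothetical.

```python
import math
from itertools import combinations

def centroid(cluster):
    n = len(cluster)
    return tuple(sum(coord) / n for coord in zip(*cluster))

def agglomerate_to_k(points, k):
    clusters = [[p] for p in points]  # every point starts alone
    while len(clusters) > k:          # Approach 1: stop at k clusters
        # Find the pair of clusters whose centroids are closest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: math.dist(centroid(clusters[ij[0]]),
                                     centroid(clusters[ij[1]])),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters

points = [(1, 2), (2, 1), (0, 0), (4, 1), (5, 3), (5, 0)]
result = agglomerate_to_k(points, 2)
print(result)
```

With k = 2 the six points end up in a left-hand group and a right-hand group, which is why choosing k well requires some prior knowledge of the data.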
64. What is Agglomerative Clustering?
Approach 2: Stop when the next merge would create a cluster with low "cohesion"
65. What is Agglomerative Clustering?
We keep clustering until the next merge would create a bad (low-cohesion) cluster
66. What is Agglomerative Clustering?
But how is cohesion defined?
67. What is Agglomerative Clustering?
Cohesion measure: Diameter of a cluster
• Diameter is the maximum distance between any pair of points in the cluster
68. What is Agglomerative Clustering?
• We terminate when the diameter of a new cluster exceeds the threshold
69. What is Agglomerative Clustering?
Cohesion measure: Radius of a cluster
70. What is Agglomerative Clustering?
• Radius is the maximum distance of a point from the centroid
71. What is Agglomerative Clustering?
• We terminate when the radius of a new cluster exceeds the threshold
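The two cohesion measures can be sketched as follows, using Euclidean distance and the six example points from the earlier slides. The helper `merge_allowed` and its `max_diameter` threshold are hypothetical names used to illustrate the stopping rule; the slides do not give a threshold value.

```python
import math
from itertools import combinations

def diameter(cluster):
    # Largest distance between any pair of points in the cluster.
    return max((math.dist(p, q) for p, q in combinations(cluster, 2)),
               default=0.0)

def radius(cluster):
    # Largest distance from any point to the cluster centroid.
    n = len(cluster)
    center = tuple(sum(coord) / n for coord in zip(*cluster))
    return max(math.dist(p, center) for p in cluster)

def merge_allowed(a, b, max_diameter):
    # Hypothetical stopping rule: reject a merge whose result is too spread out.
    return diameter(a + b) <= max_diameter

left = [(1, 2), (2, 1), (0, 0)]
right = [(4, 1), (5, 0), (5, 3)]

print(diameter(left), radius(left))
print(merge_allowed(left, right, max_diameter=3.0))
```

A radius-based rule would look the same with `radius` in place of `diameter`.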
73. What is Divisive Clustering?
The divisive clustering approach begins with the whole set and proceeds to divide it into smaller clusters
74. What is Divisive Clustering?
Step 1
• Start with a single cluster composed of all the data points
75. What is Divisive Clustering?
Step 2
• Split it into different clusters
76. What is Divisive Clustering?
• This can be done using monothetic divisive methods
77. What is Divisive Clustering?
What is a monothetic divisive method?
78. What is Divisive Clustering?
• There are two ways to do this:
1. Monothetic divisive methods
2. Polythetic divisive methods
• Obtain all possible splits of A,B,C,D,E,F into two clusters
79. What is Divisive Clustering?
• Split 1: A,B and C,D,E,F
80. What is Divisive Clustering?
• Split 2: A,D,F and B,C,E
81. What is Divisive Clustering?
• Split 3: A,B,C and D,E,F
82. What is Divisive Clustering?
• For each split, compute the cluster sum of squares
83. What is Divisive Clustering?
• We select the split with the largest sum of squares
84. What is Divisive Clustering?
• Let's assume that the sum of squared distances is largest for the 3rd split
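One way to make the split selection concrete is sketched below. It rests on two assumptions that are not in the slides: the coordinates for A to F are hypothetical, and the "sum of squares" is read as the between-cluster sum of squares, so the largest value marks the best-separated split. All function names are illustrative.

```python
from itertools import combinations

# Hypothetical 2-D coordinates; the slides only name the points A..F.
points = {"A": (0, 0), "B": (1, 0), "C": (0, 1),
          "D": (6, 5), "E": (7, 5), "F": (6, 6)}

def centroid(names):
    n = len(names)
    return tuple(sum(points[name][d] for name in names) / n for d in range(2))

def between_ss(left, right):
    # Assumed score: size-weighted squared distance of each cluster
    # centroid from the centroid of all the points being split.
    overall = centroid(left + right)
    return sum(
        len(group) * sum((c - o) ** 2 for c, o in zip(centroid(group), overall))
        for group in (left, right)
    )

def two_way_splits(names):
    # Keep the first name on the left side so each split is listed once.
    first, rest = names[0], names[1:]
    for size in range(len(rest)):  # the right side is never empty
        for extra in combinations(rest, size):
            left = [first, *extra]
            right = [n for n in rest if n not in extra]
            yield left, right

best = max(two_way_splits(sorted(points)), key=lambda s: between_ss(*s))
print(best)
```

Enumerating every split grows exponentially with the number of points, so this exhaustive search is only practical for very small sets like the six-point example.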
86. What is Divisive Clustering?
• We divide A,B,C,D,E,F into two clusters: A,B,C and D,E,F
87. What is Divisive Clustering?
• A,B,C is divided into A and B,C
88. What is Divisive Clustering?
• D,E,F is divided into D and E,F
89. What is Divisive Clustering?
• B,C is divided into B and C
90. What is Divisive Clustering?
• We terminate when every data point is its own cluster
Resulting hierarchy, level by level:
A,B,C,D,E,F
A,B,C | D,E,F
A | B,C | D | E,F
A | B | C | D | E | F
99. Demo: Hierarchical Clustering
Problem Statement
• To group petroleum companies based on their sales
Steps
• Import the dataset
• Create a scatter plot
• Normalize the data
• Calculate Euclidean distance
• Create a dendrogram
• Cluster into groups
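The normalization and distance steps of the demo can be sketched as below. This is not the demo's code or dataset: the company names and sales figures are made up for illustration, and z-score scaling is assumed as the normalization method, which the slides do not specify.

```python
import math
from statistics import mean, pstdev

# Hypothetical sales figures (two features per company); not the demo's dataset.
sales = {
    "Company A": (120.0, 30.0),
    "Company B": (125.0, 32.0),
    "Company C": (300.0, 80.0),
    "Company D": (310.0, 85.0),
}

def normalize(rows):
    # Z-score each column so that no single feature dominates the distance.
    columns = list(zip(*rows.values()))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return {
        name: tuple((v - m) / s for v, (m, s) in zip(values, stats))
        for name, values in rows.items()
    }

scaled = normalize(sales)
names = list(scaled)

# Pairwise Euclidean distances between the normalized rows.
distances = {
    (a, b): math.dist(scaled[a], scaled[b])
    for i, a in enumerate(names) for b in names[i + 1:]
}

closest = min(distances, key=distances.get)
print(closest)
```

The closest pair in this distance table is the first merge an agglomerative method would make, and therefore the lowest join in the dendrogram.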