Decision Tree Algorithm
Hands-On Example
Entropy
Gini Impurity
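For quick reference, these are the standard forms of the two measures used in the example below, for a node S with class probabilities p_i:

E(S) = -\sum_{i} p_i \log_2 p_i

Gini(S) = 1 - \sum_{i} p_i^2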
As the first step, we have to find the root (parent) node for our decision tree.
For that, follow these steps:
First, calculate the entropy of the class variable (whether play happens or not).
Note: here we typically take the log to base 2. In total there are 14 records, of
which 9 are "yes" and 5 are "no"; these counts give the probabilities
p(yes) = 9/14 and p(no) = 5/14 used in the calculation.
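Plugging these counts into the entropy formula, the entropy of the class variable works out to:

E(S) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} \approx 0.94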
From the above data for Outlook, we can easily arrive at the following table.
Now we have to calculate the average weighted entropy, i.e., the sum, over each
value of the feature, of the fraction of records taking that value multiplied by
the entropy of that subset. Subtracting this weighted entropy from the parent
entropy gives the information gain of the feature.
Now select the feature having the largest information gain. Here it is
Outlook, so it forms the first node (root node) of our decision tree.
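A rough sketch of this step in Python: the two helpers below compute entropy and information gain from per-value class counts. The class counts (9 yes / 5 no) come from the data above; the per-value Outlook counts in the example call are assumed from the standard play-tennis table used by the referenced article.

from math import log2

def entropy(counts):
    """Entropy of a node given its class counts, e.g. [yes, no]."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, splits):
    """Information gain = parent entropy minus the weighted average entropy of the splits.

    splits: one class-count list per feature value.
    """
    n = sum(parent_counts)
    weighted = sum(sum(s) / n * entropy(s) for s in splits)
    return entropy(parent_counts) - weighted

# Class variable: 9 yes / 5 no (from the data above).
parent = [9, 5]

# Assumed per-value counts for Outlook from the standard play-tennis data:
# sunny -> 2 yes / 3 no, overcast -> 4 yes / 0 no, rainy -> 3 yes / 2 no.
outlook_splits = [[2, 3], [4, 0], [3, 2]]

print(information_gain(parent, outlook_splits))  # ~0.247, the largest gain, so Outlook is the root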
Now our data, split on the Outlook values, looks as follows.
The next step is to find the next node in our decision tree; we will now find the
one under the sunny branch. We have to determine which of Temperature, Humidity,
or Windy has the highest information gain.
First, calculate the parent entropy E(sunny).
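Assuming the sunny branch contains 5 records with 2 "yes" and 3 "no" (the split in the standard play-tennis table), this works out to:

E(\text{sunny}) = -\frac{2}{5}\log_2\frac{2}{5} - \frac{3}{5}\log_2\frac{3}{5} \approx 0.971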
Similarly, we get the information gain of Temperature, Humidity, and Windy under
the sunny branch.
Humidity has the largest information gain, so it becomes the node under sunny.
From the above table, we can say that play will occur if humidity is normal and
will not occur if it is high. Similarly, find the nodes under the rainy branch.
Next, let us build the same tree using Gini impurity instead of entropy. As the
first step, we will find the root node of our decision tree. For that, calculate
the Gini index of the class variable.
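With the same class counts as before (9 "yes" and 5 "no" out of 14 records):

Gini(S) = 1 - \left(\frac{9}{14}\right)^2 - \left(\frac{5}{14}\right)^2 \approx 0.459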
As the next step, we will calculate the Gini gain. For that, we first find the
average weighted Gini impurity of Outlook, Temperature, Humidity, and Windy, and
then subtract it from the Gini index of the class variable.
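A matching sketch for the Gini route, reusing the same assumed Outlook counts as in the earlier snippet:

def gini(counts):
    """Gini impurity of a node given its class counts, e.g. [yes, no]."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

def gini_gain(parent_counts, splits):
    """Gini gain = parent Gini impurity minus the weighted average Gini impurity of the splits."""
    n = sum(parent_counts)
    weighted = sum(sum(s) / n * gini(s) for s in splits)
    return gini(parent_counts) - weighted

# Class variable: 9 yes / 5 no; assumed Outlook counts as before.
print(gini_gain([9, 5], [[2, 3], [4, 0], [3, 2]]))  # ~0.116, the highest of the four features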
Choose the feature that has the highest Gini gain. The Gini gain is highest for
Outlook, so we can choose it as our root node.
Now you have an idea of how to proceed further: repeat the same steps we used in
the ID3 algorithm for each branch.
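If you want to cross-check the result with a library, here is a minimal scikit-learn sketch. The table below is the classic play-tennis data and is assumed to match the one used in this example; note that scikit-learn grows binary CART-style trees, so the exact tree shape can differ from the hand-built ID3 tree even though the root split is the same.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# The classic play-tennis table (assumed to match the data used in the example above).
data = pd.DataFrame({
    "Outlook":     ["Sunny", "Sunny", "Overcast", "Rainy", "Rainy", "Rainy", "Overcast",
                    "Sunny", "Sunny", "Rainy", "Sunny", "Overcast", "Overcast", "Rainy"],
    "Temperature": ["Hot", "Hot", "Hot", "Mild", "Cool", "Cool", "Cool",
                    "Mild", "Cool", "Mild", "Mild", "Mild", "Hot", "Mild"],
    "Humidity":    ["High", "High", "High", "High", "Normal", "Normal", "Normal",
                    "High", "Normal", "Normal", "Normal", "High", "Normal", "High"],
    "Windy":       [False, True, False, False, False, True, True,
                    False, False, False, True, True, False, True],
    "Play":        ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
                    "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"],
})

# One-hot encode the categorical features; scikit-learn trees need numeric input.
X = pd.get_dummies(data.drop(columns="Play"))
y = data["Play"]

# criterion="entropy" mirrors the information-gain approach; criterion="gini" mirrors Gini gain.
tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))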
Advantages:
Disadvantages:
References:
https://www.saedsayad.com/decision_tree.htm
Applied-ai course