Decision Tree Notes
● A decision tree, as the name suggests, is a flowchart-like tree structure in which
every internal node represents a test on a feature, every branch represents an
outcome of that test, and every leaf (terminal) node holds a class label.
Node Splitting:
● Node splitting, or simply splitting, is the process of dividing a node into multiple
sub-nodes.
● There are multiple ways of doing this; the most commonly used splitting criteria
are listed below (a small sketch of one of them follows the list).
● Entropy / Gini Impurity
● Information Gain
● Chi-Square
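As a quick illustration of one of these criteria, here is a minimal Python sketch of
Gini impurity; the function name and the example class counts are invented for this
note, not taken from any particular library:

    # Gini impurity of a node, given the number of samples in each class.
    def gini_impurity(class_counts):
        total = sum(class_counts)
        if total == 0:
            return 0.0
        # Gini = 1 - sum of squared class proportions
        return 1.0 - sum((c / total) ** 2 for c in class_counts)

    print(gini_impurity([10, 0]))  # 0.0 -> pure (homogeneous) node
    print(gini_impurity([5, 5]))   # 0.5 -> maximum impurity for two classes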
1. At each stage (node), pick the best feature to use as the test condition.
2. Split the node into branches, one for each possible outcome of the test (the
resulting child nodes become internal nodes or leaves).
3. Repeat the above steps on each child node until the test conditions are
exhausted and every branch ends in a leaf node (see the sketch after this list).
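These steps map naturally onto a recursive function. The following is a simplified,
runnable Python sketch, not how any particular library implements it; it assumes
categorical features stored as dicts and uses entropy to pick the best feature:

    import math
    from collections import Counter

    def entropy(labels):
        # Shannon entropy of a list of class labels
        total = len(labels)
        return sum(-(c / total) * math.log2(c / total)
                   for c in Counter(labels).values())

    def build_tree(rows, labels, features):
        # rows: list of dicts mapping feature name -> categorical value
        # Stop: the node is pure, or no test conditions remain -> leaf node
        if len(set(labels)) == 1 or not features:
            return Counter(labels).most_common(1)[0][0]

        # Step 1: pick the feature whose split leaves the lowest weighted
        # entropy in the children (i.e. the highest information gain).
        def weighted_child_entropy(feature):
            groups = {}
            for row, label in zip(rows, labels):
                groups.setdefault(row[feature], []).append(label)
            return sum(len(g) / len(labels) * entropy(g)
                       for g in groups.values())

        best = min(features, key=weighted_child_entropy)

        # Step 2: split the node, one branch per observed value of the feature.
        children = {}
        remaining = [f for f in features if f != best]
        for value in {row[best] for row in rows}:
            sub = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
            sub_rows, sub_labels = map(list, zip(*sub))
            # Step 3: recurse until every branch ends in a leaf.
            children[value] = build_tree(sub_rows, sub_labels, remaining)
        return {"feature": best, "children": children}

A tiny made-up usage example:

    rows = [{"outlook": "sunny"}, {"outlook": "rain"}, {"outlook": "sunny"}]
    labels = ["no", "yes", "no"]
    print(build_tree(rows, labels, ["outlook"]))
    # e.g. {'feature': 'outlook', 'children': {'sunny': 'no', 'rain': 'yes'}}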
When you start to implement the algorithm, the first question is: ‘How do we pick
the starting test condition?’
The answer lies in the values of ‘Entropy’ and ‘Information Gain’.
Let us see what they are and how they impact the creation of our decision tree.
Entropy: Entropy in a decision tree measures the impurity (lack of homogeneity) of
a node. If the data is completely homogeneous, the entropy is 0; if the data is split
evenly (50-50) between two classes, the entropy is at its maximum value of 1.
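A small Python check of these two boundary cases (the label lists are made up for
the example):

    import math
    from collections import Counter

    def entropy(labels):
        # Shannon entropy: sum over classes of -p * log2(p)
        total = len(labels)
        return sum(-(c / total) * math.log2(c / total)
                   for c in Counter(labels).values())

    print(entropy(["yes"] * 10))              # 0.0 -> completely homogeneous
    print(entropy(["yes"] * 5 + ["no"] * 5))  # 1.0 -> 50-50 split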