How Decision Tree Algorithm Works
The general motive of using a Decision Tree is to create a training model that can be used to predict the class or value of the target variable by learning decision rules inferred from prior data (training data).
1. Place the best attribute of the dataset at the root of the tree.
2. Split the training set into subsets. Subsets should be made in such a way
that each subset contains data with the same value for an attribute.
3. Repeat step 1 and step 2 on each subset until you find leaf nodes in all the branches of the tree.
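To make these steps concrete, here is a minimal sketch of the recursive construction loop (my own illustration, not from the original post; `best_attribute` stands in for whichever selection criterion is used, such as the information gain or gini index covered below):

```python
# A minimal sketch of the recursive tree-building loop described above.
def build_tree(records, attributes, best_attribute):
    labels = {r["label"] for r in records}
    if len(labels) == 1:   # pure subset -> leaf node
        return labels.pop()
    if not attributes:     # no attributes left -> majority-class leaf
        return max(labels, key=lambda l: sum(r["label"] == l for r in records))
    attr = best_attribute(records, attributes)   # step 1: pick the best attribute
    tree = {attr: {}}
    for value in {r[attr] for r in records}:     # step 2: split into subsets
        subset = [r for r in records if r[attr] == value]
        remaining = [a for a in attributes if a != attr]
        tree[attr][value] = build_tree(subset, remaining, best_attribute)  # step 3
    return tree
```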
In decision trees, to predict a class label for a record we start from the root of the tree. We compare the value of the root attribute with the record's attribute. On the basis of this comparison, we follow the branch corresponding to that value and jump to the next node.

We continue comparing our record's attribute values with the internal nodes of the tree until we reach a leaf node with the predicted class value. Now that we know how a modeled decision tree can be used to predict the target class or value, let's understand how we can create the decision tree model.
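Here is a small Python sketch of that traversal (the node structure and the attribute names are made up for illustration; they are not from the original post):

```python
# A minimal decision tree node: internal nodes hold an attribute and
# per-value branches; leaf nodes hold a predicted class label.
class Node:
    def __init__(self, attribute=None, branches=None, label=None):
        self.attribute = attribute      # attribute tested at this node
        self.branches = branches or {}  # attribute value -> child Node
        self.label = label              # class label if this is a leaf

def predict(node, record):
    """Walk from the root to a leaf, following the branch that
    matches the record's value for each tested attribute."""
    while node.label is None:
        node = node.branches[record[node.attribute]]
    return node.label

# Hypothetical tree: the root tests "outlook", the leaves predict yes/no.
tree = Node(attribute="outlook", branches={
    "sunny": Node(label="no"),
    "rainy": Node(label="yes"),
})
print(predict(tree, {"outlook": "sunny"}))  # -> "no"
```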
Below are some of the assumptions we make while using a Decision Tree:

Decision Trees follow a Sum of Product (SOP) representation. For the above images, you can see how we can predict "can we accept the new job offer?" and "use computer daily?" by traversing from the root node to the leaf node.
Attributes Selection

Information gain
Gini index
For solving this attribute selection problem, researchers worked and devised some solutions. They suggested using criteria like information gain, gini index, etc. These criteria calculate a value for every attribute. The values are sorted, and the attributes are placed in the tree by following that order, i.e., the attribute with the highest value (in the case of information gain) is placed at the root.
Information Gain
Information gain uses entropy as a measure of impurity. For a binary classification problem with only two classes, positive and negative:

Entropy = -( p(+ve)*log2(p(+ve)) + p(-ve)*log2(p(-ve)) )

If all examples are positive or all are negative, then the entropy will be zero, i.e., low.
If half of the records are of the positive class and half are of the negative class, then the entropy is one, i.e., high.
The attributes A, B, C, and D can be considered as predictors, and the class labels in column E can be considered as the target variable. For constructing a decision tree from this data, we have to convert continuous data into categorical data.
[Table: 16 sample records with continuous values for attributes A, B, C, D and a class label in column E]
There are two steps for calculating the information gain for each attribute: first find the entropy of the target, then find the entropy of the target under each attribute's split and subtract the weighted result from the target entropy.

The entropy of the target: We have 8 records with the negative class and 8 records with the positive class, so we can directly estimate the entropy of the target as 1.
Variable E:

| Positive | Negative |
|----------|----------|
| 8        | 8        |
E(8,8) = -( p(+ve)*log2(p(+ve)) + p(-ve)*log2(p(-ve)) )
       = -( (8/16)*log2(8/16) + (8/16)*log2(8/16) )
       = 1
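As a quick check, here is a small Python sketch (my own illustration, not from the original post) that reproduces this entropy calculation:

```python
from math import log2

def entropy(pos, neg):
    """Entropy of a set with `pos` positive and `neg` negative records."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:  # 0 * log2(0) is taken as 0
            p = count / total
            result -= p * log2(p)
    return result

print(entropy(8, 8))  # -> 1.0 for the evenly split target
print(entropy(8, 0))  # -> 0.0 for a pure set
```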
Var A has value >=5 for 12 records out of 16, and 4 records have value <5.
Var B has value >=3 for 12 records out of 16, and 4 records have value <3.
Var C has value >=4.2 for 6 records out of 16, and 10 records have value <4.2.
Var D has value >=1.4 for 5 records out of 16, and 11 records have value <1.4.
The target counts for each split are:

| A      | Positive | Negative |
|--------|----------|----------|
| >= 5.0 | 5        | 7        |
| < 5.0  | 3        | 1        |

| B      | Positive | Negative |
|--------|----------|----------|
| >= 3.0 | 8        | 4        |
| < 3.0  | 0        | 4        |

| C      | Positive | Negative |
|--------|----------|----------|
| >= 4.2 | 0        | 6        |
| < 4.2  | 8        | 2        |

| D      | Positive | Negative |
|--------|----------|----------|
| >= 1.4 | 0        | 5        |
| < 1.4  | 8        | 3        |
The attribute with a better value than the others should be positioned as the root. A branch with entropy 0 should be converted to a leaf node; a branch with entropy more than 0 needs further splitting.
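To make the selection step concrete, here is a short Python sketch (my own illustration, using the split counts from the tables above) that computes the information gain of each attribute; the attribute with the highest gain becomes the root:

```python
from math import log2

def entropy(pos, neg):
    total = pos + neg
    return -sum((c / total) * log2(c / total) for c in (pos, neg) if c)

def information_gain(splits):
    """splits: list of (positive, negative) counts, one pair per branch."""
    total = sum(p + n for p, n in splits)
    parent = entropy(sum(p for p, _ in splits), sum(n for _, n in splits))
    weighted = sum((p + n) / total * entropy(p, n) for p, n in splits)
    return parent - weighted

# (positive, negative) counts for the >= and < branches of each attribute
splits = {
    "A": [(5, 7), (3, 1)],
    "B": [(8, 4), (0, 4)],
    "C": [(0, 6), (8, 2)],
    "D": [(0, 5), (8, 3)],
}
for name, s in splits.items():
    print(name, round(information_gain(s), 4))
# C has the highest information gain, so C is placed at the root.
```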
Gini Index
Gini Index is a metric that measures how often a randomly chosen element would be incorrectly identified. This means an attribute with a lower gini index should be preferred.
We are going to use the same data sample that we used for the information gain example. Let's try to use the gini index as a criterion. Here, we have 5 columns, out of which 4 columns contain continuous data and the 5th column consists of class labels. The attribute splits and their target counts are the same as in the tables above.
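Here is a companion sketch (again my own illustration, reusing the same split counts) that computes the weighted gini index of each attribute; this time the attribute with the lowest value wins:

```python
def gini(pos, neg):
    """Gini impurity 1 - p(+ve)^2 - p(-ve)^2 of a set of records."""
    total = pos + neg
    return 1 - (pos / total) ** 2 - (neg / total) ** 2

def weighted_gini(splits):
    """splits: list of (positive, negative) counts, one pair per branch."""
    total = sum(p + n for p, n in splits)
    return sum((p + n) / total * gini(p, n) for p, n in splits)

splits = {
    "A": [(5, 7), (3, 1)],
    "B": [(8, 4), (0, 4)],
    "C": [(0, 6), (8, 2)],
    "D": [(0, 5), (8, 3)],
}
for name, s in splits.items():
    print(name, round(weighted_gini(s), 4))
# C has the lowest gini index, so again C is placed at the root.
```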
Overfitting

Overfitting is a practical problem while building a decision tree model. The model is considered to be overfitting when the algorithm continues to go deeper and deeper in the tree to reduce the training set error but ends up with an increased test set error, i.e., the accuracy of prediction for our model goes down. It generally happens when the tree builds many branches due to outliers and irregularities in the data.

Overfitting can be avoided by two approaches:
Pre-Pruning
Post-Pruning
Pre-Pruning
In pre-pruning, we stop the tree construction a bit early. It is preferred not to split a node if its goodness measure is below a threshold value, but it is difficult to choose an appropriate stopping point.
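In scikit-learn, pre-pruning is usually expressed through constructor parameters that stop tree growth early. A minimal sketch (the parameter values here are arbitrary choices for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Pre-pruning: stop growing the tree early instead of letting it
# expand until every leaf is pure.
clf = DecisionTreeClassifier(
    max_depth=3,                 # never grow deeper than 3 levels
    min_samples_split=10,        # don't split nodes with fewer than 10 records
    min_impurity_decrease=0.01,  # require a minimum goodness improvement
)
clf.fit(X, y)
print(clf.get_depth())
```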
Post-Pruning
In post-pruning, we first go deeper and deeper in the tree to build a complete tree. If the tree shows the overfitting problem, then pruning is done as a post-pruning step. We use cross-validation data to check the effect of our pruning: using the cross-validation data, we test whether expanding a node will make an improvement or not.
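scikit-learn implements one form of post-pruning, minimal cost-complexity pruning, via the ccp_alpha parameter. A sketch of choosing the pruning strength with cross-validation (one possible approach, not the only way to post-prune):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Grow a full tree first, then ask for the pruning path: the sequence
# of effective alphas at which subtrees get collapsed.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y)

# Pick the alpha whose pruned tree scores best under 5-fold cross-validation.
best_alpha = max(
    path.ccp_alphas,
    key=lambda a: cross_val_score(
        DecisionTreeClassifier(random_state=0, ccp_alpha=a), X, y, cv=5
    ).mean(),
)
print(best_alpha)
```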
Advantages:

Decision trees are simple to understand, interpret, and visualize.
They require little data preparation compared to other techniques (no need for feature scaling).
They can handle both numerical and categorical data.

Disadvantages:

Decision trees easily overfit, producing over-complex trees that do not generalize well.
They can be unstable: small variations in the data can result in a completely different tree.
Greedy, locally optimal splitting does not guarantee a globally optimal tree.
I hope you like this post. If you have any questions, then feel free to comment
below. If you want me to write on one particular topic, then do tell it to me in
the comments below.