Decision Trees For Classification - A Machine Learning Algorithm - Xoriant Blog
1 of 17 12/19/2018, 11:09 AM
Decision Trees for Classification: A Machine Learning Algorithm | Xori... https://www.xoriant.com/blog/product-engineering/decision-trees-machi...
Introduction
Decision trees are a type of supervised machine learning (that is, the training data specifies both the input and the
corresponding output) in which the data is repeatedly split according to a certain parameter. A tree can be described
by two entities: decision nodes and leaves. The leaves are the decisions or final outcomes, and the decision nodes are
where the data is split.
(https://www.xoriant.com/blog/wp-content/uploads/2017/08/Decision-
Trees-modified-1.png)
An example of a decision tree can be explained using the binary tree above. Let’s say you want to predict whether a person is
fit given information such as their age, eating habits, and physical activity. The decision nodes here are questions like ‘What’s
the age?’, ‘Does he exercise?’, and ‘Does he eat a lot of pizzas?’. The leaves are outcomes like ‘fit’ or ‘unfit’. In this
case there are only two possible outcomes, so it is a binary classification problem.
What we’ve seen above is an example of a classification tree, where the outcome was a variable like ‘fit’ or ‘unfit’. Here the
decision variable is categorical. In a regression tree, by contrast, the decision or outcome variable is continuous, e.g. a
number like 123.
Working
Now that we know what a decision tree is, we’ll see how it works internally. Many algorithms construct decision trees;
one of the best known is the ID3 algorithm. ID3 stands for Iterative Dichotomiser 3.
Entropy
(https://www.xoriant.com/blog/wp-content/uploads/2017/08/Decision-Trees-modified-2.jpg)
Intuitively, entropy tells us about the predictability of a certain event. For example, consider a coin toss whose probability
of heads is 0.5 and probability of tails is 0.5. Here the entropy is the highest possible, since there’s no way of determining
what the outcome might be. Alternatively, consider a coin which has heads on both sides. The outcome of such an event can be
predicted perfectly, since we know beforehand that it’ll always be heads. In other words, this event has no randomness, hence
its entropy is zero.
In particular, lower values imply less uncertainty while higher values imply more uncertainty.
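To make the coin example concrete, here is a minimal Python sketch of Shannon entropy measured in bits. The function name `entropy` and its list-of-probabilities argument are illustrative choices, not something taken from the original post.

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits of a discrete distribution.

    `probabilities` is a sequence of outcome probabilities that sum to 1.
    Zero-probability outcomes are skipped (0 * log 0 is treated as 0).
    """
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


fair_coin = entropy([0.5, 0.5])        # maximum uncertainty for two outcomes
two_headed_coin = entropy([1.0, 0.0])  # the outcome is certain
print(fair_coin, two_headed_coin)
```

The fair coin yields one full bit of entropy, while the two-headed coin yields zero, matching the intuition described above.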
Information Gain
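Information gain, denoted by IG(S, A) for a set S, is the effective change in entropy after deciding on a particular
attribute A. It is sometimes loosely called Kullback-Leibler divergence; more precisely, in this setting it is the mutual
information between the attribute and the class, which is an expected Kullback-Leibler divergence. It measures the relative
change in entropy with respect to the independent variables.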
(https://www.xoriant.com/blog/wp-content/uploads/2017/08/Decision-Trees-modified-3.jpg)
Alternatively,
(https://www.xoriant.com/blog/wp-content/uploads/2017/08/Decision-Trees-
modified-4.jpg)
where IG(S, A) is the information gain from applying feature A, H(S) is the entropy of the entire set, and the second term
is the entropy after applying feature A, where P(x) is the probability of event x.
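The description above corresponds to the usual form IG(S, A) = H(S) - sum over x of P(x) * H(S_x), where S_x is the subset of S in which A takes the value x. The following is a small sketch under that assumption; the function names and the toy data are hypothetical and serve only to illustrate the two extremes.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(labels).values())


def information_gain(labels, attribute_values):
    """IG(S, A) = H(S) - sum over x of P(x) * H(S_x).

    `labels[i]` is the class of example i and `attribute_values[i]` is the
    value that attribute A takes for the same example.
    """
    total = len(labels)
    # Group the class labels by the value of the attribute.
    subsets = {}
    for label, value in zip(labels, attribute_values):
        subsets.setdefault(value, []).append(label)
    # Weighted average entropy of the subsets, with P(x) = |S_x| / |S|.
    remainder = sum(len(subset) / total * entropy(subset)
                    for subset in subsets.values())
    return entropy(labels) - remainder


# Hypothetical toy data: the first attribute separates the classes
# perfectly, the second carries no information about the class.
labels = ["yes", "yes", "no", "no"]
gain_a = information_gain(labels, ["p", "p", "q", "q"])
gain_b = information_gain(labels, ["p", "q", "p", "q"])
print(gain_a, gain_b)
```

An attribute that splits the set into pure subsets removes all uncertainty (gain equal to H(S)), whereas an attribute whose subsets are as mixed as the original set gives a gain of zero.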
Consider a data set collected over the course of 14 days, where the features are Outlook, Temperature, Humidity, and Wind,
and the outcome variable is whether golf was played on that day. Our job is to build a predictive model which takes in
the above four parameters and predicts whether golf will be played on the day. We’ll build a decision tree to do that using
the ID3 algorithm.
5. For each attribute, calculate the entropy with respect to the attribute ‘x’ denoted by H(S, x)
7. Remove the attribute that offers highest IG from the set of attributes
8. Repeat until we run out of all attributes, or the decision tree has all leaf nodes.
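The steps above can be sketched as a short recursive function. This is a minimal illustration and not the author's implementation: the names `id3`, `entropy`, and `information_gain`, the nested-dict tree representation, the majority-class fallback, and the toy rows are all assumptions made for the example.

```python
import math
from collections import Counter


def entropy(rows, target):
    """H(S): Shannon entropy (bits) of the target column over `rows`."""
    counts = Counter(row[target] for row in rows)
    return sum(-(n / len(rows)) * math.log2(n / len(rows))
               for n in counts.values())


def information_gain(rows, attribute, target):
    """IG(S, x): H(S) minus the weighted entropy after splitting on `attribute`."""
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset, target)
    return entropy(rows, target) - remainder


def id3(rows, attributes, target):
    """Return a nested dict {attribute: {value: subtree}} or a leaf label."""
    labels = [row[target] for row in rows]
    # Leaf: every example has the same class.
    if len(set(labels)) == 1:
        return labels[0]
    # Leaf: no attributes left, so fall back to the majority class.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Pick the attribute with the highest information gain ...
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    # ... remove it from the candidate set and recurse on each subset.
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        tree[best][value] = id3(subset, remaining, target)
    return tree


# Hypothetical toy data, used only to exercise the function.
toy_rows = [
    {"windy": "yes", "warm": "yes", "play": "no"},
    {"windy": "yes", "warm": "no", "play": "no"},
    {"windy": "no", "warm": "yes", "play": "yes"},
    {"windy": "no", "warm": "no", "play": "yes"},
]
toy_tree = id3(toy_rows, ["warm", "windy"], "play")
print(toy_tree)
```

In the toy data only `windy` determines the outcome, so the sketch selects it as the root and both branches end in leaves immediately.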
Now we’ll go ahead and grow the decision tree. The initial step is to calculate H(S), the Entropy of the current state. In the
above example, we can see in total there are 5 No’s and 9 Yes’s.
Yes No Total
9 5 14
(https://www.xoriant.com/blog/wp-content/uploads/2017/08/Decision-
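Recall that for two classes the entropy is 1 when half of the items belong to one class and the other half belong to the
other class, that is, perfect randomness. Here it’s 0.94, which means the distribution is fairly random.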
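As a quick check of the 0.94 figure, the entropy can be computed directly from the class counts in the table above (9 Yes, 5 No). The variable names are illustrative.

```python
import math

yes, no = 9, 5          # class counts from the table above
total = yes + no

p_yes, p_no = yes / total, no / total
# H(S) = -p_yes * log2(p_yes) - p_no * log2(p_no)
h_s = -p_yes * math.log2(p_yes) - p_no * math.log2(p_no)
print(round(h_s, 3))
```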
The next step is to choose the attribute that gives the highest possible information gain; that attribute becomes the
root node.
(https://www.xoriant.com/blog/wp-content/uploads/2017/08/Decision-Trees-
modified-6.jpg)
where ‘x’ ranges over the possible values of an attribute. Here, the attribute ‘Wind’ takes two possible values in the sample
data, hence x = {Weak, Strong}.
Among the 14 examples, we have 8 where the wind is Weak and 6 where the wind is Strong.
Weak Strong Total
8 6 14
(https://www.xoriant.com/blog/wp-content/uploads/2017/08/Decision-
Trees-modified-8.jpg)
Now out of the 8 Weak examples, 6 of them were ‘Yes’ for Play Golf and 2 of them were ‘No’ for ‘Play Golf’. So, we have,
Trees-modified-9.jpg)
Similarly, out of the 6 Strong examples, we have 3 examples where the outcome was ‘Yes’ for Play Golf and 3 where it was
‘No’.
(https://www.xoriant.com/blog/wp-content/uploads/2017/08/Decision-
Trees-modified-10.jpg)
Here half of the items belong to one class while the other half belong to the other, so we have perfect randomness (an
entropy of 1).
Now we have all the pieces required to calculate the Information Gain,
(https://www.xoriant.com/blog/wp-content/uploads/2017/08/Decision-
Trees-modified-11.jpg)
This tells us that considering ‘Wind’ as the feature gives an information gain of 0.048. Now we must similarly calculate
the information gain for the other attributes.
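The Wind calculation can be reproduced from the counts quoted in the text (9 Yes and 5 No overall, 6 Yes and 2 No for Weak, 3 Yes and 3 No for Strong). The helper name `binary_entropy` is an illustrative choice.

```python
import math


def binary_entropy(yes, no):
    """Entropy (bits) of a subset with `yes` positive and `no` negative examples."""
    total = yes + no
    return sum(-(n / total) * math.log2(n / total) for n in (yes, no) if n)


h_s = binary_entropy(9, 5)        # whole set: 9 Yes, 5 No
h_weak = binary_entropy(6, 2)     # Wind = Weak: 6 Yes, 2 No
h_strong = binary_entropy(3, 3)   # Wind = Strong: 3 Yes, 3 No

# Weight each subset's entropy by its share of the 14 examples.
ig_wind = h_s - (8 / 14) * h_weak - (6 / 14) * h_strong
print(f"H(S)={h_s:.3f} H(Weak)={h_weak:.3f} H(Strong)={h_strong:.3f} IG={ig_wind:.3f}")
```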
(https://www.xoriant.com/blog/wp-content/uploads/2017/08/Decision-Trees-
modified-12.jpg)
We can clearly see that IG(S, Outlook) has the highest information gain of 0.246, hence we choose the Outlook attribute as the
root node.
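The comparison across attributes can be sketched as follows. The data table itself is not reproduced in the text, so the rows below are an assumption: they are the widely circulated 14-day play-golf table, which agrees with the counts quoted in this article (9 Yes and 5 No, 8 Weak and 6 Strong). The function and variable names are illustrative.

```python
import math
from collections import Counter

# ASSUMPTION: the commonly used 14-day "play golf" table; the article's own
# table is not available in the text, so these rows are illustrative.
COLUMNS = ("Outlook", "Temperature", "Humidity", "Wind")
ROWS = [
    # Outlook, Temperature, Humidity, Wind, Play Golf
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(labels).values())


def information_gain(rows, column):
    """IG for the attribute stored at index `column`; the class is the last field."""
    labels = [row[-1] for row in rows]
    remainder = 0.0
    for value in {row[column] for row in rows}:
        subset = [row[-1] for row in rows if row[column] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


gains = {name: information_gain(ROWS, i) for i, name in enumerate(COLUMNS)}
root = max(gains, key=gains.get)
for name, gain in gains.items():
    print(f"IG(S, {name}) = {gain:.3f}")
print("root:", root)
```

With these assumed rows, Outlook comes out highest at about 0.2467, which agrees with the article's 0.246 up to rounding, and Wind comes out at about 0.048.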
(https://www.xoriant.com/blog/wp-content/uploads/2017/08/Decision-Trees-
modified-13.jpg)
Here we observe that whenever the outlook is Overcast, Play Golf is always ‘Yes’. This is no coincidence: the simple
subtree results because the attribute Outlook gives the highest information gain.
How do we proceed from this point? We simply apply recursion; you might want to look back at the algorithm steps
described earlier.
Now that we’ve used Outlook, three attributes remain: Humidity, Temperature, and Wind. Outlook had three possible values:
Sunny, Overcast, and Rain. The Overcast node already ended up as a leaf node ‘Yes’, so two subtrees remain to be computed:
Sunny and Rain.
(https://www.xoriant.com/blog/wp-content/uploads/2017/08/Decision-
Trees-modified-33.jpg)
(https://www.xoriant.com/blog/wp-content/uploads/2017/08/Decision-
(https://www.xoriant.com/blog/wp-content/uploads/2017/08/Decision-Trees-modified-
15.jpg)
As we can see, the highest information gain is given by Humidity. Proceeding in the same way with the remaining branch, we
again select the attribute with the highest information gain. The final decision tree looks something like this.
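The recursive step can be illustrated on a single branch. The sketch below assumes that the Humidity result refers to the Outlook = Sunny branch and uses the five Sunny rows of the commonly circulated 14-day play-golf table; both points are assumptions, since the article's table and intermediate figures are not reproduced in the text. The names are illustrative.

```python
import math
from collections import Counter

# ASSUMPTION: the five Outlook = Sunny rows of the commonly used 14-day
# play-golf table; the article's own table is not available in the text.
COLUMNS = ("Temperature", "Humidity", "Wind")
SUNNY_ROWS = [
    # Temperature, Humidity, Wind, Play Golf
    ("Hot", "High", "Weak", "No"),
    ("Hot", "High", "Strong", "No"),
    ("Mild", "High", "Weak", "No"),
    ("Cool", "Normal", "Weak", "Yes"),
    ("Mild", "Normal", "Strong", "Yes"),
]


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(labels).values())


def information_gain(rows, column):
    """IG for the attribute at index `column`; the class is the last field."""
    labels = [row[-1] for row in rows]
    remainder = 0.0
    for value in {row[column] for row in rows}:
        subset = [row[-1] for row in rows if row[column] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


# Outlook is already used, so only the three remaining attributes compete.
gains = {name: information_gain(SUNNY_ROWS, i) for i, name in enumerate(COLUMNS)}
best = max(gains, key=gains.get)
print(best, {name: round(gain, 3) for name, gain in gains.items()})
```

Under these assumptions Humidity splits the Sunny examples into pure subsets, so its gain equals the entropy of the whole branch and both children become leaves.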
(https://www.xoriant.com/blog/wp-content/uploads/2017/08/Decision-
Trees-modified-16.jpg)
Code:
# Requires scikit-learn, pydotplus and a Graphviz installation; the image is
# displayed inline, so run this in a Jupyter/IPython notebook.
import pydotplus
from sklearn.datasets import load_iris
from sklearn import tree
from IPython.display import Image, display

__author__ = "Mayur Kulkarni <mayur.kulkarni@xoriant.com>"


def load_data_set():
    """
    Loads the iris data set

    :return: data set instance
    """
    iris = load_iris()
    return iris


def train_model(iris):
    """
    Train decision tree classifier

    :param iris: iris data set instance
    :return: classifier instance
    """
    clf = tree.DecisionTreeClassifier()
    clf = clf.fit(iris.data, iris.target)
    return clf


def display_image(clf, iris):
    """
    Displays the decision tree image

    :param clf: classifier instance
    :param iris: iris data set instance
    """
    # Export the fitted tree as Graphviz DOT source, then render it to PNG.
    dot_data = tree.export_graphviz(clf, out_file=None,
                                    feature_names=iris.feature_names,
                                    class_names=iris.target_names,
                                    filled=True, rounded=True)
    graph = pydotplus.graph_from_dot_data(dot_data)
    display(Image(data=graph.create_png()))


if __name__ == '__main__':
    # Use the loader defined above instead of calling load_iris() directly.
    iris_data = load_data_set()
    decision_tree_classifier = train_model(iris_data)
    display_image(clf=decision_tree_classifier, iris=iris_data)
(https://www.xoriant.com/blog/wp-content/uploads/2017/08/Decision-Trees-modified-88.png)
Conclusion:
1. Entropy measures the discriminatory power of an attribute for a classification task. It quantifies the amount of
randomness in the attribute with respect to the class. Minimal entropy means the attribute's values map closely to one
class, so the attribute has good discriminatory power.
2. Information gain ranks the attributes considered at a given node in the tree. The ranking is based on information gain,
with the highest gain preferred.
Mayur Kulkarni
Software Engineer