Entropy and Information Gain for Decision Tree Algorithms
Decision trees are supervised learning models used for classification and regression. They work by
recursively splitting the dataset into subsets based on feature values, creating a tree-like
structure of decisions that leads to predictions. Here’s an overview of decision trees and some
commonly used algorithms:
• Root Node: The topmost node in the tree, representing the initial feature or question.
• Internal Nodes: Each internal node represents a feature (attribute) on which the data is split.
• Edges: Each branch from a node represents a decision based on that feature’s value.
• Leaf Nodes: Represent the final output (class or value) after all decisions have been made (a
small sketch of such a tree follows this list).
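To make these components concrete, here is a minimal, hand-written Python sketch of a decision
tree for a hypothetical "play tennis" dataset (the feature names and values are made up for
illustration): the root node tests one feature, each branch follows one of that feature's values,
and each leaf returns a class label.

def predict(sample: dict) -> str:
    # Root node: test the "outlook" feature.
    if sample["outlook"] == "sunny":
        # Internal node: test the "humidity" feature.
        if sample["humidity"] == "high":
            return "no"      # leaf node: class label
        return "yes"         # leaf node
    elif sample["outlook"] == "overcast":
        return "yes"         # leaf node
    else:  # outlook == "rainy"
        # Internal node: test the "windy" feature.
        if sample["windy"]:
            return "no"      # leaf node
        return "yes"         # leaf node

print(predict({"outlook": "sunny", "humidity": "high", "windy": False}))  # -> "no"

A learning algorithm's job is to choose which feature to test at each node; the criteria below
(information gain, gain ratio, Gini impurity, chi-square) are different ways of scoring those
choices.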
a) ID3 (Iterative Dichotomiser 3)
• Criterion: It uses information gain to decide which feature to split on, favoring splits that result
in the greatest reduction in entropy (see the sketch after this subsection).
• Limitations: Prone to overfitting and cannot handle numeric (continuous) features directly
without discretizing them first.
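As a rough sketch of how ID3's criterion can be computed (the toy outlook/play data below is made
up for illustration): entropy is H(S) = -Σ_i p_i log2(p_i), where p_i is the proportion of class i
in S, and the information gain of a feature A is IG(S, A) = H(S) - Σ_v (|S_v|/|S|) H(S_v), where
S_v is the subset of S in which A takes the value v.

from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy H(S) = -sum_i p_i * log2(p_i) over the class labels in S."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """IG(S, A) = H(S) - sum_v (|S_v| / |S|) * H(S_v)."""
    n = len(labels)
    gain = entropy(labels)
    for v in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == v]
        gain -= len(subset) / n * entropy(subset)
    return gain

# Toy example: ID3 would split on whichever feature has the highest gain.
outlook = ["sunny", "sunny", "overcast", "rainy", "rainy", "overcast"]
play    = ["no",    "no",    "yes",      "yes",   "no",    "yes"]
print(information_gain(outlook, play))  # ~0.667 bits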
b) C4.5
• Criterion: Uses gain ratio (information gain normalized by split information), which reduces
ID3’s bias toward features with many values; C4.5 also handles both continuous and categorical
data (see the sketch after this subsection).
• Handling of Missing Values: C4.5 can handle datasets with missing values more effectively than
ID3.
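One way to sketch the gain ratio, reusing the entropy and information_gain helpers (and the toy
outlook/play data) from the ID3 example above: the gain ratio divides the information gain by the
split information, i.e. the entropy of the partition that the feature itself induces, which
penalizes features with many distinct values.

from collections import Counter
from math import log2

def split_information(feature_values):
    """Entropy of the partition induced by the feature: -sum_v (|S_v|/|S|) * log2(|S_v|/|S|)."""
    n = len(feature_values)
    return -sum((c / n) * log2(c / n) for c in Counter(feature_values).values())

def gain_ratio(feature_values, labels):
    """C4.5's criterion: information gain normalized by split information."""
    si = split_information(feature_values)
    return information_gain(feature_values, labels) / si if si > 0 else 0.0

print(gain_ratio(outlook, play))  # the gain from above divided by the split information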
c) CART (Classification and Regression Trees)
• Developed by Leo Breiman and colleagues, CART is widely used in both classification and
regression.
• Criterion: For classification, CART uses Gini impurity as the splitting criterion, while for
regression, it uses mean squared error (MSE); see the sketch after this subsection.
• Binary Splits Only: CART splits the data into exactly two branches at each node, creating binary
trees.
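A minimal sketch of both CART criteria (the toy labels and values below are made up): Gini
impurity is 1 - Σ_k p_k², and a candidate binary split is scored by the weighted impurity of the
two child nodes; for regression, the MSE around the node mean plays the same role.

from collections import Counter

def gini_impurity(labels):
    """Gini impurity = 1 - sum_k p_k^2, where p_k is the fraction of class k."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def mse(values):
    """Mean squared error around the node mean (CART's regression criterion)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# Score a candidate binary split by the weighted impurity of its two children.
left, right = ["yes", "yes", "no"], ["no", "no"]
n = len(left) + len(right)
after = len(left) / n * gini_impurity(left) + len(right) / n * gini_impurity(right)
print(gini_impurity(left + right), after)  # impurity before vs. after the split

print(mse([3.0, 5.0, 4.0]))  # regression example: spread of the node's target values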
d) CHAID (Chi-squared Automatic Interaction Detection)
• CHAID is used for categorical data and bases its splits on the chi-square test.
• Criterion: Uses statistical tests (chi-square for categorical targets, F-tests/ANOVA for
continuous targets) to determine splits; a small chi-square example follows this subsection.
• Multifurcating Splits: Unlike CART, CHAID can create branches with multiple splits from a single
node.
• Use Cases: Often used for market research and survey analysis.
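As a rough illustration of the kind of test CHAID applies (the contingency table below is made up,
and SciPy is assumed to be available; a hand-rolled chi-square computation would work just as
well):

from scipy.stats import chi2_contingency

# Rows: the two branches of a candidate split on a categorical feature;
# columns: counts of each target class observed in that branch.
observed = [[30, 10],   # branch 1: 30 "yes", 10 "no"
            [12, 28]]   # branch 2: 12 "yes", 28 "no"

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")

# CHAID keeps a split (after merging categories that are not significantly
# different from one another) only when the association with the target is
# significant, i.e. the p-value is below a chosen threshold such as 0.05.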
Advantages and limitations of decision trees in general:
• Little Data Preprocessing: Decision trees often require minimal data preparation; feature
normalization or scaling is usually unnecessary.
• Overfitting: Decision trees can easily overfit, especially with deep trees.
• Instability (High Variance): Sensitive to small changes in the data, which can lead to vastly
different trees.
• Preference for Certain Features: Criteria such as information gain tend to favor features with
many distinct levels (one motivation for C4.5’s gain ratio).
• Feature Selection: By splitting on the most informative features first, decision trees perform a
built-in form of feature selection.