Using ID3 Decision Tree Algorithm To The Student Grade Analysis and Prediction

Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5, August 2019.
PDF URL: http://www.ijtsrd.com/papers/ijtsrd26545.pdf
Paper URL: https://www.ijtsrd.com/computer-science/data-miining/26545/using-id3-decision-tree-algorithm-to-the-student-grade-analysis-and-prediction/khin-khin-lay

International Journal of Trend in Scientific Research and Development (IJTSRD)

Volume 3 Issue 5, August 2019 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470

Using ID3 Decision Tree Algorithm to the Student Grade Analysis and Prediction
Khin Khin Lay, San San Nwe
Associate Professor, University of Computer Studies, Maubin, Myanmar

How to cite this paper: Khin Khin Lay | San San Nwe "Using ID3 Decision Tree Algorithm to the Student Grade Analysis and Prediction" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5, August 2019, pp.1392-1395, https://doi.org/10.31142/ijtsrd26545

Copyright © 2019 by author(s) and International Journal of Trend in Scientific Research and Development Journal. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0)

ABSTRACT
Data mining techniques play an important role in data analysis. In this research, a decision tree algorithm is used to construct a classification model that can predict the performance of students, particularly in engineering branches. A number of factors may affect the performance of students. Data mining technology is well suited to analysing student grades, and we also use classification algorithms for prediction. In this paper, we use educational data mining to predict students' final grades based on their performance. We propose student data classification using the ID3 (Iterative Dichotomiser 3) decision tree algorithm.

KEYWORDS: Classification, ID3, Data Mining, Decision Tree, Predicting Performance

I. INTRODUCTION
Educational data mining is an interesting research area which extracts useful, previously unknown patterns from educational databases for better understanding, improved educational performance and assessment of the student learning process (Surjeet & Saurabh, 2012). The main function of data mining techniques is to apply various methods and algorithms in order to discover and extract patterns from stored data. These interesting patterns are presented to the user and may be stored as new knowledge in a knowledge base.
Data mining has been used in areas such as database systems, data warehousing, statistics, machine learning, data visualization, and information retrieval.

Data mining techniques have been introduced to new areas including neural networks, pattern recognition, spatial data analysis, image databases and many application fields such as business, economics and bioinformatics. Some types of data mining techniques are: Clustering, Association Rule Mining, Neural Networks, Genetic Algorithms, Nearest Neighbor Method, Classification Rule Mining, Decision Trees and many others. The reported results indicated that the decision tree model had better prediction than other models.

A decision tree is a flow-chart-like tree structure, where each internal node is denoted by a rectangle and each leaf node is denoted by an oval. All internal nodes have two or more child nodes. All internal nodes contain splits, which test the value of an expression of the attributes. Arcs from an internal node to its children are labelled with distinct outcomes of the test. Each leaf node has a class label associated with it.

Decision trees are commonly used for gaining information for the purpose of decision-making. A decision tree starts with a root node, from which users take actions.

From this node, users split each node recursively according to the decision tree learning algorithm. The final result is a decision tree in which each branch represents a possible scenario of decision and its outcome (Surjeet & Saurabh, 2012).

In data mining, decision trees can also be described as the combination of mathematical and computational techniques to aid the description, categorization and generalization of a given set of data. The four widely used decision tree learning algorithms are: ID3, CART, CHAID and C4.5.

II. RELATED WORK
In order to predict the performance of students, the researcher took into consideration the work of other researchers working in the same direction. Other researchers have looked at predicting students' performance by applying many approaches and have come up with diverse results.

Three supervised data mining algorithms, i.e. Bayesian, Decision Trees and Neural Networks, were applied by [1] on the preoperative assessment data to predict success in a course (to produce a result of either passed or failed), and the performance of the learning methods was evaluated based on their predictive accuracy, ease of learning and user-friendly characteristics. The researchers observed that this methodology can be used to help students and teachers to improve students' performance and to reduce the failing ratio by taking appropriate steps at the right time to improve the quality of learning.

[2] compared four different classifiers and combined the results into a multiple classifier. Their research divided the data into three (3) different classes. Weighting the features and using a genetic algorithm to minimize the error rate improved the prediction accuracy by at least 10% in all the cases of 2, 3 and 9 classes. In cases where the number of features is low, feature weighting worked much better than feature selection. The successful optimization of student classification in all three cases demonstrates the merits of using the LON-CAPA data to predict the students' final grades based on their features, which are extracted from the homework data. However, the research in this case was based on an online course as opposed to the regular classroom class that the present study considers.

Furthermore, [3] observed that in the problem of prediction of performance, it is possible to automatically predict students' performance. Moreover, by using an extensible classification formalism such as Bayesian networks, which was employed in their research, it becomes possible to easily and uniformly integrate such knowledge into the learning task. The researchers' experiments also show the need for methods aimed at predicting performance and exploring more learning algorithms.

Also, [7] used the Iterative Dichotomiser 3 (ID3) decision tree algorithm to predict the grades of students at a university in Nigeria. A prediction accuracy of 79.556% was obtained from the model. They further suggested the use of other decision-based models to predict students' performance.

III. OUR PROPOSED METHOD
A. The ID3 Decision Tree
ID3 is a simple decision tree learning algorithm developed by Ross Quinlan (1983). The basic idea of the ID3 algorithm is to construct the decision tree by employing a top-down, greedy search through the given sets to test each attribute at every tree node. In order to select the attribute that is most useful for classifying a given set, we introduce a metric: information gain.

To find an optimal way to classify a learning set, what we need to do is to minimize the questions asked (i.e. minimize the depth of the tree). Thus, we need some function which can measure which questions provide the most balanced splitting. The information gain metric is such a function.

Given a data table that contains attributes and the class of the attributes, we can measure the homogeneity of the table based on the classes. The index used to measure the degree of impurity is entropy [2]. The entropy is calculated as follows:

$$\mathrm{Entropy}(S) = -\sum_{i} P_i \log_2(P_i)$$

where P_i is the proportion of examples in S that belong to class i. The splitting criterion used for splitting the nodes of the tree is information gain: to determine the best attribute for a particular node in the tree, we use the measure called Information Gain.

B. Advantage of ID3
- Understandable prediction rules are created from the training data.
- Builds the fastest tree.
- Builds a short tree.
- Only needs to test enough attributes until all data is classified.
- Finding leaf nodes enables test data to be pruned, reducing the number of tests.

C. Disadvantage of ID3
- Data may be over-fitted or over-classified if a small sample is tested.
- Only one attribute at a time is tested for making a decision.
- Classifying continuous data may be computationally expensive, as many trees must be generated to see where to break the continuum.

IV. Data Preparation
The first step in this paper is to collect data. It is important to select the most suitable attributes which influence student performance. We have a training set of 30 undergraduate students. We were provided with a training dataset consisting of information about students admitted to the first year, shown in Table I.
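The entropy and information gain measures described for ID3 can be illustrated with a minimal Python sketch. The function names `entropy` and `information_gain` and the four-example toy data are assumptions made for illustration only; they are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Base-2 entropy of a list of class labels (0 for a pure set)."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(labels, attribute_values):
    """Entropy of the labels minus the weighted entropy after splitting
    the examples by the value of one attribute."""
    total = len(labels)
    remainder = 0.0
    for value in set(attribute_values):
        subset = [lab for lab, v in zip(labels, attribute_values) if v == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


# Hypothetical toy data: four examples, two classes.
labels = ["Pass", "Pass", "Fail", "Fail"]
informative = ["Yes", "Yes", "No", "No"]   # separates the classes perfectly
uninformative = ["A", "B", "A", "B"]       # each value mixes both classes

print(entropy(labels))
print(information_gain(labels, informative))
print(information_gain(labels, uninformative))
```

An attribute that separates the classes completely recovers the full entropy of the set as gain, while an attribute whose values each mix the classes evenly yields no gain. This is why ID3 prefers the attribute with the highest gain at each node.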

Table I. Training Data Set

Sr. no.  Roll no.  Attendance  Aptitude  Assignment  Test  Presentation  Grade
1        IT1       Good        Avg       Yes         Pass  Good          Excellent
2        IT2       Good        Avg       Yes         Pass  Good          Excellent
3        IT3       Good        Avg       Yes         Pass  Good          Excellent
4        IT4       Good        Avg       Yes         Pass  Good          Excellent
5        IT5       Good        Avg       Yes         Pass  Good          Excellent
6        IT6       Avg         Avg       Yes         Pass  Avg           Good
7        IT7       Poor        Good      Yes         Pass  Avg           Good
8        IT8       Avg         Good      Yes         Pass  Avg           Good
9        IT9       Avg         Good      Yes         Pass  Avg           Good
10       IT10      Poor        Poor      No          Fail  Poor          Fail
11       IT11      Poor        Poor      No          Fail  Poor          Fail
12       IT12      Avg         Avg       Yes         Pass  Avg           Good
13       IT13      Good        Good      Yes         Pass  Good          Excellent
14       IT14      Good        Good      Yes         Pass  Good          Excellent
15       IT15      Good        Good      Yes         Pass  Good          Excellent
16       IT16      Good        Good      Yes         Pass  Good          Excellent
17       IT17      Good        Avg       Yes         Pass  Good          Excellent
18       IT18      Good        Avg       Yes         Pass  Good          Excellent
19       IT19      Good        Avg       Yes         Pass  Good          Excellent
20       IT20      Good        Poor      Yes         Pass  Good          Excellent
21       IT21      Good        Poor      Yes         Pass  Good          Excellent
22       IT22      Good        Poor      Yes         Pass  Good          Excellent
23       IT23      Good        Poor      Yes         Pass  Good          Excellent
24       IT24      Good        Poor      Yes         Pass  Good          Excellent
25       IT25      Poor        Poor      No          Fail  Poor          Fail
26       IT26      Avg         Good      Yes         Pass  Avg           Good
27       IT27      Poor        Good      No          Fail  Poor          Fail
28       IT28      Good        Good      Yes         Pass  Good          Excellent
29       IT29      Good        Good      Yes         Pass  Good          Excellent
30       IT30      Good        Good      Yes         Pass  Good          Excellent
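As a rough illustration of how the rows of Table I could be fed to ID3, the sketch below encodes the 30 examples, computes the information gain of each attribute, and grows a tree recursively. It is not the authors' implementation: the names `COUNTED_ROWS`, `gain` and `id3` are hypothetical, identical rows are stored once with a repeat count, and the two "Age" entries in row IT12 are assumed to be a misprint of "Avg".

```python
from collections import Counter
from math import log2

ATTRS = ["Attendance", "Aptitude", "Assignment", "Test", "Presentation"]

# (Attendance, Aptitude, Assignment, Test, Presentation, Grade), repeat count.
# Assumption: the "Age" entries of row IT12 are read as "Avg".
COUNTED_ROWS = [
    (("Good", "Avg", "Yes", "Pass", "Good", "Excellent"), 8),   # IT1-5, IT17-19
    (("Good", "Good", "Yes", "Pass", "Good", "Excellent"), 7),  # IT13-16, IT28-30
    (("Good", "Poor", "Yes", "Pass", "Good", "Excellent"), 5),  # IT20-24
    (("Avg", "Avg", "Yes", "Pass", "Avg", "Good"), 2),          # IT6, IT12
    (("Avg", "Good", "Yes", "Pass", "Avg", "Good"), 3),         # IT8, IT9, IT26
    (("Poor", "Good", "Yes", "Pass", "Avg", "Good"), 1),        # IT7
    (("Poor", "Poor", "No", "Fail", "Poor", "Fail"), 3),        # IT10, IT11, IT25
    (("Poor", "Good", "No", "Fail", "Poor", "Fail"), 1),        # IT27
]
ROWS = [dict(zip(ATTRS + ["Grade"], values))
        for values, count in COUNTED_ROWS for _ in range(count)]


def entropy(rows):
    """Base-2 entropy of the Grade column."""
    counts = Counter(r["Grade"] for r in rows)
    return sum(-(n / len(rows)) * log2(n / len(rows)) for n in counts.values())


def gain(rows, attr):
    """Information gain of splitting the rows on one attribute."""
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(rows) - remainder


def id3(rows, attrs):
    """Return a class label (leaf) or a nested dict {attribute: {value: subtree}}."""
    grades = Counter(r["Grade"] for r in rows)
    if len(grades) == 1 or not attrs:
        return grades.most_common(1)[0][0]  # pure node, or majority class
    best = max(attrs, key=lambda a: gain(rows, a))
    rest = [a for a in attrs if a != best]
    return {best: {value: id3([r for r in rows if r[best] == value], rest)
                   for value in sorted({r[best] for r in rows})}}


GAINS = {a: gain(ROWS, a) for a in ATTRS}
TREE = id3(ROWS, ATTRS)

print(f"Entropy(S) = {entropy(ROWS):.6f}")
for attr, value in GAINS.items():
    print(f"Gain(S, {attr}) = {value:.6f}")
print(TREE)
```

The printed gains can be compared with the values the paper reports for each attribute. Because every Presentation value in this table maps to a single grade, the sketch stops after one split; other data would generally produce deeper trees.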

To work out the information gain for an attribute A relative to S, we first need to calculate the entropy of S (Grade). Here S (Grade) is a set of 30 examples, of which 20 are "Excellent (Ex)", 6 are "Good (G)" and 4 are "Fail (F)".

$$\mathrm{Entropy}(S) = -P_{Ex}\log_2(P_{Ex}) - P_{G}\log_2(P_{G}) - P_{F}\log_2(P_{F}) \tag{1.1}$$

$$= -\frac{20}{30}\log_2\frac{20}{30} - \frac{6}{30}\log_2\frac{6}{30} - \frac{4}{30}\log_2\frac{4}{30} = 1.241946$$

To determine the best attribute for a particular node in the tree, we use the measure called Information Gain. The information gain, Gain(S, A), of an attribute A relative to a collection of examples S is computed as shown here for Attendance; the values for all attributes are listed in Table II.

$$\mathrm{Gain}(S, \mathrm{Attendance}) = \mathrm{Entropy}(S) - \frac{|S_{G}|}{|S|}\mathrm{Entropy}(S_{G}) - \frac{|S_{Avg}|}{|S|}\mathrm{Entropy}(S_{Avg}) - \frac{|S_{Poor}|}{|S|}\mathrm{Entropy}(S_{Poor}) \tag{1.2}$$

$$= 1.241946 - 0.1203213 = 1.1216247$$

In Table I the Good and Avg attendance subsets each contain a single grade, so their entropy is zero; the subtracted term 0.1203213 comes entirely from the Poor subset (5 of the 30 examples, 1 "Good" and 4 "Fail").

Table II. Information Gain Value Table

Gain                    Value
Gain(S, Attendance)     1.1216247
Gain(S, Aptitude)       0.234518
Gain(S, Assignment)     0.5665102
Gain(S, Test)           0.5665095
Gain(S, Presentation)   1.241946

Therefore, the "Presentation" attribute, which has the highest information gain, is the decision attribute in the root node. "Presentation" as root node has three possible values, Good, Average and Poor, as shown in Figure 1.

Figure 1. Presentation as root node

This process goes on until all data is classified perfectly or we run out of attributes. The knowledge represented by the decision tree can be extracted and represented in the form of IF-THEN rules, as in Figure 2.

IF Presentation = "Good" AND Attendance = "Good" THEN Grade = "Excellent"
IF Presentation = "Average" AND Test = "Pass" THEN Grade = "Good"
IF Presentation = "Poor" AND Test = "Fail" THEN Grade = "Fail"

Figure 2. Rule set generated by the decision tree

V. CONCLUSIONS
A classification model has been proposed in this study for predicting students' grades, particularly for IT undergraduate students. In this paper, the classification task is applied to a student database to predict the students' division on the basis of the previous database. Of the many approaches that are used for data classification, the decision tree method is used here. Information such as Attendance, Class test, Aptitude, Presentation and Assignment marks was collected from the students' previous database to predict the performance at the end of the semester.

REFERENCES
[1] Osmanbegovic E., Suljic M. "Data mining approach for predicting student performance". Economic Review - Journal of Economics and Business. Volume 10(1) (2012).
[2] Behrouz, M, Karshy, D, Korlemeyer G, Punch, W. "Predicting student performance: an application of data Mining methods with the educational web-based system Lon-capa". 33rd ASEE/IEEE Frontiers in Education Conference. Boulder C.O. USA, (2003).

[3] Bekele, R., Menzel, W. "A bayesian approach to predict performance of a student (BAPPS): A Case with Ethiopian Students". Journal of Information Science (2013).
[4] Kovacic, Z. "Early prediction of student success: Mining student enrollment data". Proceedings of Informing Science & IT Education Conference. (2010).
[5] Surjeet K, Yadav, Bharadwaj, B. Pal B. "Data Mining Applications: A comparative Study for Predicting Student's performance". International journal of innovative technology & creative engineering. Volume 1(12). (2012).
[6] Nnamani, C. N, Dikko, H. G and Kinta, L. M. "Impact of students' financial strength on their academic performance: Kaduna Polytechnic experience". African Research Review 8(1), (2014).
[7] Ogunde A.O., Ajibade D.A. "A data Mining System for Predicting University Students' Graduation Grade Using ID3 Decision Tree approach". Journal of Computer Science and Information Technology, Volume 2(1) (2014).
[8] Undavia, J. N., Dolia, P. M., Shah, N. P. "Prediction of Graduate Students for Master Degree based on Their Past Performance using Decision Tree in Weka Environment". International Journal of Computer Applications, Volume 74 (21), (2013).
https://www.wekatutorial.com/