Using ID3 Decision Tree Algorithm To The Student Grade Analysis and Prediction
Volume 3 Issue 5, August 2019 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470
@ IJTSRD | Unique Paper ID – IJTSRD26545 | Volume – 3 | Issue – 5 | July - August 2019 Page 1392
International Journal of Trend in Scientific Research and Development (IJTSRD) @ www.ijtsrd.com eISSN: 2456-6470
[2] compared four different classifiers and combined the results into a multiple classifier. Their research divided the data into three (3) different classes. Weighting the features and using a genetic algorithm to minimize the error rate improved the prediction accuracy by at least 10% in all cases of 2, 3 and 9 classes. In cases where the number of features is low, feature weighting worked much better than feature selection. The successful optimization of student classification in all three cases demonstrates the merits of using the LON-CAPA data to predict the students' final grades based on their features, which are extracted from the homework data. However, the research in this case was based on an online course, as opposed to the regular classroom class that the present study considers.

Furthermore, [3] observed that it is possible to predict students' performance automatically. Moreover, by using an extensible classification formalism such as Bayesian networks, which was employed in their research, it becomes possible to easily and uniformly integrate such knowledge into the learning task. The researchers' experiments also show the need for methods aimed at predicting performance and for exploring more learning algorithms.

Also, [8] used the Iterative Dichotomiser 3 (ID3) decision tree algorithm to predict the grades of students at a university in Nigeria. A prediction accuracy of 79.556% was obtained from the model. They further suggested the use of other decision-based models to predict students' performance.

III. OUR PROPOSED METHOD
A. The ID3 Decision Tree
ID3 is a simple decision tree learning algorithm developed by Ross Quinlan (1983). The basic idea of the ID3 algorithm is to construct the decision tree by employing a top-down, greedy search through the given sets to test each attribute at every tree node. In order to select the attribute that is most useful for classifying a given set, we introduce a metric: information gain.

To find an optimal way to classify a learning set, we need to minimize the number of questions asked (i.e. minimize the depth of the tree). Thus, we need some function which can measure which questions provide the most balanced splitting. The information gain metric is such a function.

Given a data table that contains attributes and the class of each row, we can measure the homogeneity of the table based on the classes. The index used to measure the degree of impurity is entropy [2]. The entropy of a table S is calculated as follows:

Entropy(S) = -Σ p_i log2(p_i)

where p_i is the proportion of rows in S that belong to class i. The splitting criterion for the nodes of the tree is information gain: to determine the best attribute for a particular node, we compute the information gain of each candidate attribute and select the one with the highest value.

B. Advantages of ID3
- Understandable prediction rules are created from the training data.
- Builds the fastest tree.
- Builds a short tree.
- Only enough attributes need to be tested until all data is classified.
- Finding leaf nodes enables test data to be pruned, reducing the number of tests.

C. Disadvantages of ID3
- Data may be over-fitted or over-classified if a small sample is tested.
- Only one attribute at a time is tested for making a decision.
- Classifying continuous data may be computationally expensive, as many trees must be generated to see where to break the continuum.

IV. DATA PREPARATION
The first step in this paper is to collect data. It is important to select the most suitable attributes which influence the student performance. We have a training set of 30 undergraduate students. We were provided with a training dataset consisting of information about students admitted to the first year, shown in Table I.
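Because attribute selection on the training set relies on the entropy and information gain measures introduced in Section III-A, a minimal Python sketch of both may be useful here. It is an illustration only, not the implementation used in this paper: the function names `entropy` and `information_gain`, the attribute names `attendance` and `result`, and the four toy records are all assumptions made for the example and are not taken from Table I.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of `target` minus the weighted entropy after splitting on `attribute`."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical toy records (NOT from Table I), used only to exercise the functions.
toy = [
    {"attendance": "Good", "result": "Pass"},
    {"attendance": "Good", "result": "Pass"},
    {"attendance": "Poor", "result": "Fail"},
    {"attendance": "Poor", "result": "Pass"},
]

print(entropy([r["result"] for r in toy]))
print(information_gain(toy, "attendance", "result"))
```

A pure subset (all rows in one class) has entropy 0, and an evenly split two-class subset has entropy 1, so an attribute whose values isolate pure subsets receives a high gain.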
16 IT16 Good Good Yes Pass Good Excellent
17 IT17 Good Avg Yes Pass Good Excellent
18 IT18 Good Avg Yes Pass Good Excellent
19 IT19 Good Avg Yes Pass Good Excellent
20 IT20 Good Poor Yes Pass Good Excellent
21 IT21 Good Poor Yes Pass Good Excellent
22 IT22 Good Poor Yes Pass Good Excellent
23 IT23 Good Poor Yes Pass Good Excellent
24 IT24 Good Poor Yes Pass Good Excellent
25 IT25 Poor Poor No Fail Poor Fail
26 IT26 Avg Good Yes Pass Avg Good
27 IT27 Poor Good No Fail Poor Fail
28 IT28 Good Good Yes Pass Good Excellent
29 IT29 Good Good Yes Pass Good Excellent
30 IT30 Good Good Yes Pass Good Excellent
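The top-down, greedy construction described in Section III-A can be sketched in Python on the fifteen rows listed above (students 16 to 30). This is a sketch under stated assumptions rather than the procedure used in this paper: the table header is not part of these rows, so the column names `A1` to `A5` and `grade` are placeholders, the last column is assumed to be the class, and the helper names `entropy`, `gain` and `id3` are introduced only for illustration.

```python
from collections import Counter
from math import log2

# Placeholder column names: the real header of Table I is not available here.
COLUMNS = ["A1", "A2", "A3", "A4", "A5", "grade"]

# Attribute values of rows 16-30, copied from the table (number and ID omitted).
RAW = """
Good Good Yes Pass Good Excellent
Good Avg Yes Pass Good Excellent
Good Avg Yes Pass Good Excellent
Good Avg Yes Pass Good Excellent
Good Poor Yes Pass Good Excellent
Good Poor Yes Pass Good Excellent
Good Poor Yes Pass Good Excellent
Good Poor Yes Pass Good Excellent
Good Poor Yes Pass Good Excellent
Poor Poor No Fail Poor Fail
Avg Good Yes Pass Avg Good
Poor Good No Fail Poor Fail
Good Good Yes Pass Good Excellent
Good Good Yes Pass Good Excellent
Good Good Yes Pass Good Excellent
"""
rows = [dict(zip(COLUMNS, line.split())) for line in RAW.strip().splitlines()]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(n / total * log2(n / total) for n in Counter(labels).values())


def gain(rows, attr, target):
    """Information gain obtained by splitting `rows` on `attr`."""
    split = Counter(r[attr] for r in rows)
    remainder = sum(
        count / len(rows) * entropy([r[target] for r in rows if r[attr] == value])
        for value, count in split.items()
    )
    return entropy([r[target] for r in rows]) - remainder


def id3(rows, attrs, target):
    """Return a nested dict {attribute: {value: subtree}} or a class label (leaf)."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]  # majority class as leaf
    best = max(attrs, key=lambda a: gain(rows, a, target))  # greedy choice
    rest = [a for a in attrs if a != best]
    return {
        best: {
            value: id3([r for r in rows if r[best] == value], rest, target)
            for value in sorted({r[best] for r in rows})
        }
    }


tree = id3(rows, COLUMNS[:-1], "grade")
print(tree)
```

On these fifteen rows alone, the first column already separates the three grades, so the sketch stops after a single split. The tree reported in this paper is derived from all 30 students and need not coincide with this partial result.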
Therefore, the "Presentation" attribute is the decision attribute in the root node. "Presentation" as root node has three possible values: Good, Average and Poor, as shown in Figure 1.

REFERENCES
[1] Osmanbegovic E., Suljic M. "Data mining approach for predicting student performance". Economic Review - Journal of Economics and Business, Volume 10(1) (2012).
[2] Behrouz, M, Karshy, D, Korlemeyer G, Punch, W. "Predicting student performance: an application of data mining methods with the educational web-based system LON-CAPA". 33rd ASEE/IEEE Frontiers in Education Conference, Boulder, CO, USA (2003).
[3] Bekele, R., Menzel, W. "A Bayesian approach to predict performance of a student (BAPPS): A Case with Ethiopian Students". Journal of Information Science (2013).
[4] Kovacic, Z. "Early prediction of student success: Mining student enrollment data". Proceedings of Informing Science & IT Education Conference (2010).
[5] Surjeet K, Yadav, Bharadwaj, B. Pal B. "Data Mining Applications: A comparative Study for Predicting Student's performance". International Journal of Innovative Technology & Creative Engineering, Volume 1(12) (2012).
[6] Nnamani, C. N, Dikko, H. G and Kinta, L. M. "Impact of students' financial strength on their academic performance: Kaduna Polytechnic experience". African Research Review 8(1) (2014).
[7] Ogunde A.O., Ajibade D.A. "A data Mining System for Predicting University Students Graduation Grade Using ID3 Decision Tree approach". Journal of Computer Science and Information Technology, Volume 2(1) (2014).
[8] Undavia, J. N., Dolia, P. M., Shah, N. P. "Prediction of Graduate Students for Master Degree based on Their Past Performance using Decision Tree in Weka Environment". International Journal of Computer Applications, Volume 74(21) (2013).
https://www.wekatutorial.com/