Review on Prediction Algorithms in Educational Data Mining
Researchers rarely use the MATLAB and RapidMiner tools to predict student performance.
Most of the reviewed research used the Weka tool for prediction.
CART and Naïve Bayes algorithms are frequently used by researchers.
Summary:
The systematic process of discovering reliable, novel, valuable, and understandable patterns in massive
and complicated data sets is known as Knowledge Discovery in Databases. Educational Data Mining
analyzes data generated by any sort of information technology that supports learning or education in
schools, colleges, universities, and other academic or professional learning institutions. Predictive
approaches are used to forecast student performance. Several Bayes algorithms have been developed,
the most important of which are Bayesian networks and naive Bayes. Neural networks are good at
spotting patterns or trends in data and have been extensively explored for predicting student
performance. Support Vector Machines are well suited to cases where the class boundaries are nonlinear
but there is too little data to learn complex nonlinear models. K-nearest-neighbour classifiers do not
build an explicit global model; they estimate it locally and implicitly.
Knowledge Discovery in Databases (KDD) is the automatic, exploratory analysis and modelling of large
data warehouses. Applying data mining techniques to educational data is a fascinating research field
known as Educational Data Mining (EDM). EDM examines data generated by educational institutions
for tasks such as predicting student performance. In the reviewed studies, the random forest algorithm
gave the best result compared to other algorithms for predicting student performance, and the CHAID
algorithm predicted the performance of dropout students with the highest accuracy. Shaleena and
Shaiju Paul found relevant factors and relationships that lead a student to pass or fail.
Knowledge Discovery in Databases (KDD) is the automatic, exploratory analysis and modelling of large
data warehouses: a controlled process of identifying valid, new, useful, and understandable patterns in
huge and complex data sets. Applying data mining techniques to educational data is a fascinating
research field known as Educational Data Mining (EDM). EDM analyses data created by any type of
information system that supports learning or education in schools, colleges, universities, and other
academic or professional learning institutions, including those that offer conventional teaching models
and e-learning. The random forest algorithm gave the best result compared to other algorithms for
predicting student performance.
Dinesh and R.V. Radhika investigated student performance using a feature selection method.
Dr Tajuniza et al. studied the factors that affect students' academic achievement. A.S. Galathiya et al.
analyzed research on classification with an improved decision tree algorithm. The classification
accuracy was improved by implementing variants of the algorithm using RGUI with Weka packages.
In educational data mining, prediction techniques are used to predict student performance. These
include classification, regression, and density estimation. Classification is a two-step process: a learning
phase, in which a model is built from labelled training data, and a classification phase, in which the
model assigns class labels to new records. Several Bayes algorithms have been developed, among
which Bayesian networks and naïve Bayes are the two essential techniques. Neural networks are good at
identifying patterns or trends in data and are well studied for the prediction of student performance.
Support Vector Machines are well suited to cases where the class boundaries are nonlinear but there is
too little data to learn complex nonlinear models. K-nearest-neighbour classifiers do not build an
explicit global model; they estimate it locally and implicitly.
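
The two-step view of classification and the local, model-free nature of k-nearest neighbours can be
illustrated with a minimal Python sketch. It is not taken from any of the reviewed studies: the toy
records, the two features (attendance percentage and internal-assessment mark), the choice of k = 3,
and the function names learn and classify are all assumptions made for illustration.

```python
from collections import Counter
from math import dist  # Euclidean distance, available in Python 3.8+

# Hypothetical toy records: (attendance %, internal-assessment mark) -> outcome.
TRAINING = [
    ((90.0, 80.0), "pass"),
    ((85.0, 70.0), "pass"),
    ((95.0, 65.0), "pass"),
    ((40.0, 30.0), "fail"),
    ((50.0, 35.0), "fail"),
    ((45.0, 20.0), "fail"),
]


def learn(records):
    """Learning phase: k-NN fits no explicit global model, it only stores the labelled records."""
    return list(records)


def classify(model, features, k=3):
    """Classification phase: majority vote among the k stored records nearest to `features`."""
    nearest = sorted(model, key=lambda record: dist(record[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


model = learn(TRAINING)
prediction_strong = classify(model, (88.0, 75.0))  # a student close to the "pass" records
prediction_weak = classify(model, (42.0, 25.0))    # a student close to the "fail" records
```

Because the decision for each new student depends only on nearby stored records, the class boundary is
never written down explicitly; it emerges locally at classification time. In practice the features would
usually be scaled first so that no single attribute dominates the distance.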
There is an increasing demand for density estimation on data streams. Density estimation supports
tasks such as inference, pattern mining, and outlier detection. Most of the research used classification
algorithms, such as decision tree and Bayes classification algorithms, to predict student performance.
These predictions helped instructors and institutions to identify weak students and to take appropriate
action to improve their level of study. Most of the research used the Weka tool for prediction. Researchers
rarely use the MATLAB and RapidMiner tools to predict student performance. In future, these algorithms
and models can be applied to predicting student performance, and new algorithms, new student variables,
and new data mining tools can also be identified for better prediction based on this study.