Students Performance Prediction

Volume 8, Issue 8, August 2023 International Journal of Innovative Science and Research Technology
ISSN No:-2456-2165

Dona Boby, Megha Madhu, Ann Mary Danty
Department of Mathematics, Amrita Vishwa Vidhyapeetham, Kochi, India
Abstract:- Student performance prediction is an important aspect of education that has gained significance in recent years. Predicting the academic outcomes of students can help educators identify students who are at risk of falling behind and provide them with targeted interventions to improve academic performance. New technologies such as deep learning have revolutionized the way student performance prediction is done. Deep learning algorithms can analyze large amounts of data and identify patterns that would be difficult to detect using traditional statistical methods. In the proposed study, the dataset of students at a Portuguese school contains various features such as age, gender, family background, travel time, and weekly study time. The deep learning techniques employed in this study include Artificial Neural Networks (ANN), Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and Bi-directional LSTM (BiLSTM). The performance of these deep learning models was evaluated using metrics such as accuracy, mean squared error (MSE), and mean absolute error (MAE). This study demonstrates the effectiveness of deep learning techniques in predicting student performance and can be used as a basis for developing interventions to improve academic outcomes.

Keywords:- Deep Learning; Academic Performance; Early Prediction.

I. INTRODUCTION

Education plays a crucial role in shaping the future of children and society as a whole. It provides children with the knowledge and skills they need to succeed in life and make positive contributions to their communities. Education also helps children develop critical thinking, problem-solving, and decision-making abilities that will serve them well throughout their lives. Furthermore, education can promote social mobility, reduce poverty and inequality, and promote economic growth and development [1]. Overall, investing in education for children is essential for building a brighter, more prosperous, and equitable future for all.

Education is the process of acquiring knowledge, skills, values, and attitudes through various methods such as teaching, training, research, and practical experience. Education can take place in formal settings such as schools and universities, as well as informal settings such as workplaces and homes. It is not just about memorizing facts and figures, but also about developing social skills and creativity. Education helps individuals to broaden their perspectives and understand diverse cultures, which in turn promotes tolerance and mutual respect. Moreover, education is a fundamental human right that should be accessible to all, regardless of their background, gender, or socioeconomic status.

Education has undergone a significant transformation with the widespread application of technology in recent years. Technology has allowed for the development of new teaching methodologies and learning tools that cater to diverse learning styles and promote greater student engagement. From online courses to virtual classrooms, technology has made education more accessible and convenient than ever before, and it is expected to continue to play a crucial role in the future of education. However, it is important to ensure that the integration of technology in education is done thoughtfully and with a focus on enhancing learning outcomes, rather than simply replacing traditional teaching methods.

Educational data mining is a field that involves analyzing large sets of educational data to identify patterns, trends, and insights that can inform instructional decision-making [2]. This includes data from a range of sources, such as student assessments, attendance records, and demographic information. By analyzing this data, educators can gain a better understanding of student performance and behavior, and develop targeted interventions to improve outcomes. Our study demonstrates that the use of deep learning models can significantly improve the accuracy of predicting students’ academic performance compared to traditional statistical models.

II. RELATED WORK

Over the past few years, there have been numerous studies aimed at discovering various patterns and strategies that can enhance students’ academic performance. Different methodologies and tools are used to visualize and analyze the data. Some of the related works done so far are discussed in this section.

A. Nabil, M. Seyam and A. Abou-Elfetouh [3] presented a study to explore the efficiency of deep learning in the field of Educational Data Mining for predicting students’ academic performance, in order to identify the students who are at risk of failure. The study used a public 4-year university dataset and developed predictive models to forecast students’ performance in upcoming courses based on their grades in the previous courses of the first academic year. Various models, including a deep neural network (DNN), decision tree, random forest, gradient boosting, logistic regression, support vector classifier, and K-nearest neighbor, were employed for this purpose. The DNN model proposed in the study exhibited a high accuracy of 89 %, outperforming other models such as decision tree, logistic regression, support vector classifier, and K-nearest neighbor. The model was able to predict students’ performance in a data structure course and thus helped to identify those students at risk of failure at an early stage of a semester.

A recent study focused on predicting student performance using the attention-based Bidirectional Long Short-Term Memory
IJISRT23AUG770 www.ijisrt.com 1494

(BiLSTM) network [4]. They incorporated advanced feature classification and prediction techniques and found that the combination of BiLSTM and an attention mechanism resulted in superior performance, achieving an accuracy of 90.16 %.

M. R. Islam Rifat, A. S. M. Badrudduza and A. Al Imran [5] proposed a Deep Neural Network (DNN) based model to predict the final CGPA of undergraduate business students. They collected a real dataset by gathering transcript data from the marketing department of a reputable public university in Bangladesh. The dataset consisted of transcript data from students who graduated in 2013, 2014, 2015, and 2016. The researchers compared their proposed model’s performance to a baseline decision tree model. The results showed that their DNN-based model significantly outperformed the baseline model. The proposed model reduced the mean squared error (MSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) by 0.0146, 0.0431, and 6.043, respectively, compared to the baseline model.

Similarly, another study addressed the prediction of student learning outcomes in Learning Management Systems [6]. By combining two methods, namely a CNN to extract effective features from the data and an LSTM to identify interdependencies in time-series data, the prediction accuracy was improved compared with state-of-the-art methods, with a testing accuracy of 94.3 %.

Another study used an undergraduate database obtained from the Student Advisory and Support Center at Bina Nusantara University, comprising a total of 46,670 students enrolled from 2010 to 2017. The study introduced a dual-input deep learning model [7] capable of concurrently analyzing time-series and tabular data to forecast student GPA. The proposed model attained the best results among all assessed models, predicting GPA on a 4.0 scale with a Mean Squared Error (MSE) of 0.4142 and a Mean Absolute Error (MAE) of 0.418.

A two-year analysis of student learning data from the University of Hail was conducted in [8]. A bidirectional long short-term memory (BiLSTM) model was utilized to investigate students whose retention was at risk. The model has diverse features that can be used to assess how new students will perform, thus aiding the timely prediction of student retention and dropout. Conditional Random Fields (CRF) for sequence labeling were employed to make independent predictions for each label of the students. Prediction of student retention was possible with an accuracy of over 0.85 in most scenarios, with false-positive (FP) rates ranging from 0.05 to 0.10 in most cases.

In a study conducted by H. N. Alhulail and H. P. Singh, the primary objective was to develop a model that maximizes predictability and enables the early identification of at-risk student-teachers [9]. The researchers used questionnaires and a four-step logistic regression procedure to analyze a sample of 1723 student-teachers enrolled in public teacher training colleges (TTCs) in a least-developed country (LDC). The study recognized instructional overall performance and aspirations as the most important predictors of student-teacher attrition.

A recent research project aimed to develop a classifier that could predict the academic performance of computer science students at Al-Muthanna University’s College of Humanities (MU) [10]. The study employed several machine learning techniques, including Naïve Bayes, Logistic Regression, Artificial Neural Network, and Decision Tree, to build predictive models. The models were compared using performance measures such as the ROC index and accuracy, while other metrics such as the F measure, classification error, recall, and precision were also computed. To build the models, the researchers used a dataset that combined information from a survey administered to the students and the students’ grade book. The ANN model achieved the best ROC index, 0.807, and the best accuracy, 77.04 %.

H. M. R. Hasan, A. S. A. Rabby, M. T. Islam and S. A. Hossain [11] presented a study aimed at predicting students’ performance using machine learning algorithms. For training, they used data on 1170 students across three subjects, collected from student records. They obtained their best results with the K-Nearest Neighbors and Decision Tree classifier models, with accuracies of 89.74 % and 94.44 %.

III. METHODOLOGY

A. Dataset Description
The dataset used here is the Student Performance Dataset from a Portuguese secondary school. It contains information on students’ personal and academic backgrounds as well as their performance in one subject, Portuguese.

The dataset includes 33 variables, such as the student’s age, gender, family size, parents’ education, travel time, weekly study time, previous failures, weekly alcohol consumption, whether the student has internet access at home, and their final grades in the course.

The dataset was collected from 2005 to 2006, and it includes 649 instances. The purpose of this dataset is to predict the final grade (G3) of students based on their personal and academic background.

B. Data Preprocessing
Data preprocessing is a crucial step in the data analysis pipeline that involves cleaning, transforming, and preparing raw data for further analysis. The following steps are taken to preprocess the data:

One-hot encode categorical variables using Pandas’ get_dummies() function. Encode binary variables using scikit-learn’s LabelEncoder class. Normalize numeric variables using scikit-learn’s StandardScaler class. Normalization scales the numeric variables so that they have a mean of 0 and a standard deviation of 1. The final preprocessed data consists of the one-hot encoded categorical variables, the label-encoded binary variables, and the normalized numeric variables. The preprocessed data is split into training and testing sets using an 80:20 ratio.
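
The preprocessing steps of Section III.B can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the toy rows, the column names, and the groupings `binary_cols`, `categorical_cols`, and `numeric_cols` are made up for the example, and the scaler is fit on the training portion only, a choice the text does not specify.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Toy stand-in for the real dataset; rows and column names are assumptions.
df = pd.DataFrame({
    "school": ["GP", "MS", "GP", "MS", "GP", "GP", "MS", "GP", "MS", "GP"],
    "Mjob": ["teacher", "health", "other", "services", "other",
             "teacher", "health", "other", "services", "other"],
    "age": [15, 16, 17, 18, 15, 16, 17, 18, 16, 17],
    "studytime": [2, 1, 3, 2, 4, 1, 2, 3, 2, 1],
    "G3": [11, 9, 14, 10, 16, 8, 12, 13, 10, 7],
})

binary_cols = ["school"]            # two-valued columns -> label encoding
categorical_cols = ["Mjob"]         # multi-valued columns -> one-hot encoding
numeric_cols = ["age", "studytime"] # numeric columns -> standard scaling
target_col = "G3"                   # final grade, the prediction target


def preprocess(frame):
    """Encode, split 80:20, and standardize; returns train/test features and targets."""
    X = frame.drop(columns=[target_col]).copy()
    y = frame[target_col].astype(float)

    # Label-encode each binary column to 0/1.
    for col in binary_cols:
        X[col] = LabelEncoder().fit_transform(X[col])

    # One-hot encode the multi-valued categorical columns.
    X = pd.get_dummies(X, columns=categorical_cols, dtype=float)

    # Cast numeric columns to float so scaled values can be stored cleanly.
    X[numeric_cols] = X[numeric_cols].astype(float)

    # 80:20 split; the fixed seed is only for reproducibility of the sketch.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    X_train, X_test = X_train.copy(), X_test.copy()

    # Assumption: fit the scaler on training rows only to avoid leaking
    # test-set statistics; the paper does not state the order of these steps.
    scaler = StandardScaler()
    X_train[numeric_cols] = scaler.fit_transform(X_train[numeric_cols])
    X_test[numeric_cols] = scaler.transform(X_test[numeric_cols])
    return X_train, X_test, y_train, y_test


X_train, X_test, y_train, y_test = preprocess(df)
```

With the ten toy rows, the split yields eight training rows and two test rows, and the scaled training columns have mean 0 and unit standard deviation.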

Overall, the data preprocessing steps ensure that the data is in a format suitable for analysis and prediction using deep learning algorithms. This helps to improve the correctness of the predictions and increases confidence in the results.

C. Artificial Neural Network (ANN)
An ANN model is important in the student performance prediction task because it can capture complex non-linear relationships between the input variables and the output variable.

The input layer receives data, which is then passed through one or more hidden layers before reaching the output layer. Each neuron in the network receives inputs from other neurons [12], performs a mathematical computation on the inputs, and produces an output that is passed to other neurons in the next layer.

The weights and biases of the connections between neurons are adjusted during training, using an algorithm such as backpropagation, to optimize the network’s performance on a given task, such as classification, regression, or pattern recognition.

In this study, the main use of the ANN is to predict the final grades (G3) of high school students based on various input features such as demographics, academic performance, and family background.

Initially, the data is preprocessed by encoding categorical variables, normalizing numeric variables, and splitting the data into training and testing sets. The ANN model is then built using Keras; it consists of an input layer, two hidden layers with dropout regularization, and an output layer. The model is compiled with a mean squared error loss function and an Adam optimizer, and then trained on the training set for 100 epochs with a batch size of 16.

After training, the model is used to make predictions on the test set, which are then compared to the actual test scores. The accuracy of the model is calculated by comparing the predicted binary outcomes (pass/fail) to the actual binary outcomes, and the mean squared error and mean absolute error are also calculated. Finally, a scatter plot is generated to visualize the comparison between the actual and predicted test scores. The model obtained an accuracy of 0.846, with an MSE of 2.19 and an MAE of 1.1.

D. Long Short-Term Memory (LSTM)
LSTM works by selectively remembering or forgetting information from previous inputs using specialized units called LSTM cells, which allows it to maintain a long-term memory of past inputs [13] and make accurate predictions based on that memory.

In this study, an LSTM model is trained on student performance data to predict the final grade in a course. The input data is preprocessed and normalized using one-hot encoding, label encoding, and standard scaling. The input features are reshaped to fit the LSTM input format of (samples, timesteps, features), where each sample corresponds to a single student and each timestep corresponds to a single input feature. The model is built using the Keras API in TensorFlow and includes one LSTM layer followed by two fully connected layers with dropout to prevent overfitting. The model is then compiled using mean squared error loss and the Adam optimizer. The model is trained on the training set and tested on the testing set. The predictions are converted to binary outcomes indicating whether the student passes or fails the course based on a threshold score of 10. Finally, the accuracy of the model is calculated, and a scatter plot is generated to compare the actual and predicted test scores. The model obtained an accuracy of 0.876, with an MSE of 1.92 and an MAE of 0.97.

E. Convolutional Neural Network (CNN)
CNNs are useful for analyzing sequential data because they are able to learn patterns and features that are invariant to location. They can detect features regardless of where they occur in the input sequence. This is done through the use of convolutional layers [14], which apply filters to the input sequence to detect patterns and features. MaxPooling layers are also used to downsample the output of the convolutional layers, which helps to reduce the number of parameters and prevent overfitting.

A Convolutional Neural Network (CNN) is used for regression to predict the final grade (G3) of high school students based on various features. CNNs are commonly used for image classification tasks, but they can also be used for time series analysis, which is the case here since the input data is reshaped into a 3D array with dimensions (number of samples, number of features, 1).

In this study, the CNN model consists of two Conv1D layers, each followed by a MaxPooling1D layer, and then two fully connected (Dense) layers with Dropout to prevent overfitting. The model is trained using mean squared error as the loss function and the Adam optimizer. After training, the model is used to predict the final grades of the test set, and the mean squared error and mean absolute error are calculated to evaluate the performance of the model. Finally, a scatter plot is used to compare the actual test scores to the predicted test scores. The model obtained an accuracy of 0.853, with an MSE of 2.1 and an MAE of 1.09.

F. Bi-Directional LSTM
A BiLSTM is a type of Recurrent Neural Network (RNN) that processes the input sequence in both forward and backward directions. This means that the model can learn from both the past and the future context of the input sequence [15], which can be helpful for tasks such as sequence prediction and classification.

In this case, the BiLSTM layer is used to process the input features of the student data, which are represented as a sequence of values for each student. The output of the BiLSTM layer is then passed through two additional Dense layers with dropout regularization to predict the final test score.

The use of a BiLSTM layer in this model can help to capture the complex relationships and patterns in the input sequence, and potentially improve the accuracy of the predictions. However, it is important to note that the effectiveness of the BiLSTM layer will depend on the specific task and the characteristics of the input data. The model is compiled with mean squared error loss and the Adam optimizer. It is then trained on the training set for 100 epochs with a batch size of 16. After training, the model predicts the final grades of the test set. The model obtained an accuracy of 0.823, with an MSE of 2.64 and an MAE of 1.2.

IV. RESULT

The bar chart displays the accuracy scores of four different algorithms, namely Artificial Neural Network (ANN), Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and Bidirectional LSTM (BiLSTM), for predicting students’ performance.
The results show that LSTM achieved the highest accuracy score of 0.876, followed by CNN with an accuracy score of 0.853, ANN with an accuracy score of 0.846, and BiLSTM with an accuracy score of 0.823.
Fig. 1: Bar Chart
LSTM achieved the highest accuracy score, followed by CNN, ANN, and BiLSTM. This may be attributed to the fact that the LSTM algorithm is able to capture long-term dependencies in sequential data, such as student performance records, which may provide more information for accurate prediction.

The bar chart helps in visualizing the accuracy scores of the different deep learning algorithms for predicting students’ performance.
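
The accuracy scores compared here are pass/fail accuracies: per the methodology, predicted grades are converted to binary outcomes with a threshold score of 10, while MSE and MAE are computed on the predicted grades themselves. A minimal sketch of that evaluation is given below, assuming NumPy and assuming that a grade equal to the threshold counts as a pass (the text does not say). The function name `evaluate` and the sample values are illustrative and are not taken from the study.

```python
import numpy as np

PASS_MARK = 10  # threshold score stated in the methodology


def evaluate(y_true, y_pred, pass_mark=PASS_MARK):
    """Return pass/fail accuracy, MSE, and MAE for predicted grades."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    errors = y_pred - y_true
    mse = float(np.mean(errors ** 2))     # mean squared error on raw grades
    mae = float(np.mean(np.abs(errors)))  # mean absolute error on raw grades

    # Assumption: a grade >= pass_mark is a pass; the paper only gives the threshold.
    passed_true = y_true >= pass_mark
    passed_pred = y_pred >= pass_mark
    accuracy = float(np.mean(passed_true == passed_pred))

    return {"accuracy": accuracy, "mse": mse, "mae": mae}


# Illustrative values only, not results from the study.
actual = [12, 9, 15, 10, 7]
predicted = [11.0, 10.5, 14.0, 9.0, 8.0]
metrics = evaluate(actual, predicted)
```

Because accuracy only checks which side of the threshold a prediction falls on, a model can have a low MSE and still misclassify students whose grades lie close to 10, which is why the three metrics are reported together.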
Fig. 2: ANN
Fig. 3: LSTM
Fig. 4: CNN
Fig. 5: Bi-LSTM

The scatter plots above compare the actual test scores with the predicted test scores. The diagonal line represents perfect predictions, and the plots help to visualize the performance of the models in predicting the test scores.

V. CONCLUSION

In this study, we used deep learning models to predict the performance of students using the Portuguese course grades dataset. We used four deep learning algorithms: ANN, LSTM, CNN, and BiLSTM. Based on our results, the LSTM algorithm achieved the highest accuracy score of 0.876, making it the best-performing algorithm among the four deep learning models used for student performance prediction.

This study can help teachers and school administrators identify students who may need additional support or resources to improve their academic performance. Additionally, the model can provide insights into which factors are most strongly associated with student performance, which can inform educational policies and interventions.

There are several ways in which the above study can be improved. These include collecting more data to improve the model’s accuracy and generalization, performing feature engineering to transform the existing features into more meaningful ones, using ensemble learning techniques to improve the model’s accuracy and robustness, and adding more complex layers to capture more complex patterns in the data.

REFERENCES

[1] David Ashton, Francis Green, et al. Education, training and the global economy. Edward Elgar, Cheltenham, 1996.
[2] Cristóbal Romero and Sebastián Ventura. Educational data mining: a review of the state of the art. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 40(6):601–618, 2010.
[3] Aya Nabil, Mohammed Seyam, and Ahmed Abou-Elfetouh. Prediction of students’ academic performance based on courses’ grades using deep neural networks. IEEE Access, 9:140731–140746, 2021.
[4] Bashir Khan Yousafzai, Sher Afzal Khan, Taj Rahman, Inayat Khan, Inam Ullah, Ateeq Ur Rehman, Mohammed Baz, Habib Hamam, and Omar Cheikhrouhou. Studentperformulator: student academic performance using hybrid deep neural network. Sustainability, 13(17):9775, 2021.
[5] Md Rifatul Islam Rifat, Abdullah Al Imran, and ASM Badrudduza. Edunet: a deep neural network approach for predicting CGPA of undergraduate students. In 2019 1st International Conference on Advances in Science, Engineering and Robotics Technology (ICASERT), pages 1–6. IEEE, 2019.
[6] Abdulaziz Salamah Aljaloud, Diaa Mohammed Uliyan, Adel Alkhalil, Magdy Abd Elrhman, Azizah Fhad Mohammed Alogali, Yaser Mohammed Altameemi, Mohammed Altamimi, and Paul Kwan. A deep learning model to predict student learning outcomes in LMS using CNN and LSTM. IEEE Access, 10:85255–85265, 2022.
[7] Harjanto Prabowo, Alam Ahmad Hidayat, Tjeng Wawan Cenggoro, Reza Rahutomo, Kartika Purwandari, and Bens Pardamean. Aggregating time series and tabular data in deep learning model for university students’ GPA prediction. IEEE Access, 9:87370–87377, 2021.
[8] Diaa Uliyan, Abdulaziz Salamah Aljaloud, Adel Alkhalil, Hanan Salem Al Amer, Magdy Abd Elrhman Abdallah Mohamed, and Azizah Fhad Mohammed Alogali. Deep learning model to predict students retention using BLSTM and CRF. IEEE Access, 9:135550–135558, 2021.
[9] Harman Preet Singh and Hilal Nafil Alhulail. Predicting student-teachers dropout risk and early identification: A four-step logistic regression approach. IEEE Access, 10:6470–6482, 2022.
[10] Hussein Altabrawee, Osama Abdul Jaleel Ali, and Samir Qaisar Ajmi. Predicting students’ performance using machine learning techniques. Journal of University of Babylon for Pure and Applied Sciences, 27(1):194–205, 2019.
[11] HM Rafi Hasan, AKM Shahariar Azad Rabby, Mohammad Touhidul Islam, and Syed Akhter Hossain. Machine learning algorithm for student’s performance prediction. In 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT), pages 1–7. IEEE, 2019.
[12] George Cybenko. Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems, 2(4):303–314, 1989.
[13] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
[14] Matthew D Zeiler and Rob Fergus. Visualizing and understanding convolutional networks. In Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part I 13, pages 818–833. Springer, 2014.
[15] Zhiheng Huang, Wei Xu, and Kai Yu. Bidirectional LSTM-CRF models for sequence tagging. arXiv preprint arXiv:1508.01991, 2015.
