This document discusses using machine learning to predict student performance in online learning environments. It reviews studies that have examined online course data to predict student outcomes using machine learning techniques. The studies identified features of online courses used for prediction, outputs of prediction models, methodologies for feature extraction, evaluation metrics, and challenges. Machine learning algorithms commonly used in the studies include logistic regression, naive Bayes, decision trees, AdaBoost, k-nearest neighbor, and neural networks. The document provides an in-depth analysis of different machine learning models and their effectiveness in predicting student certificate acquisition, grades, and students at risk of failure.
2. 2
Int. J. Mech. Eng. Res. & Tech 2023
ISSN 2454 – 535X www.ijmert.com
Vol. 11, Issue.3, July 2023
Using Machine Learning To Predict Student Performance In Online
Learning Environments
By
Raunak Bedi
Abstract
Prediction of academic performance of students is one of the major topics for universities and
schools as it can be helpful to design the right mechanisms to avoid dropouts and improve
academic results, among others. A lot of processes have been automated in usual activities of
students to benefit them and manage big data gathered from software products for tech-based
learning. Hence, processing and analyzing the same data properly can give a lot of vital insights
to their knowledge and relation between students and their homework. This information can feed
promising methods and algorithms for prediction of student performance.
This study is conducted to analyze various machine learning models used for predicting student’s
performance. This study presents an in-depth review of studies examining data of online learning
environments to predict students’ outcomes with machine learning techniques. This study will help
identify the online course features used for prediction of learners’ outcome, determine the outputs
of prediction, strategies, and methodologies of feature extraction for prediction of performance,
evaluation metrics, and challenges and limitations for analyzing the outcomes.
Keywords: academic performance, student performance, online learning, machine learning
algorithms, machine learning models, online learning environments
Introduction
The way people used to learn has been revolutionized by online learning and education has never
been so convenient and affordable to billions of people worldwide. Irrespective of rising interest
and benefits of distance and online learning, universities and schools are highly concerned about
students’ retention and academic performance, along with low degree/certification completion
rates and high dropout rates.
Shalom Presidency School
A-199 1st
floor sector 55 Gurgaon, Haryana,122011, India
bedi.raunak@gmail.com
Phone No=+91 9354980255
Int. J. Mech. Eng. Res. & Tech 20221
3. 3
Dropping out or failing an online program
or course is usually an important parameter
to assess course/program quality and
allocate resources by institutional
authorities.
Low certification and dropout rates are also
a major risk factor to profitability, funding,
and reputation of an institution (Arce et al,
2015). These outcomes have vast impact
on well-being, self-esteem, odds of
graduating, and employment of students
(Arce et al., 2015; Larusson & White,2014).
Hence, it is important to find more efficient
methods to forecast performance of
students as early as possible for students,
educators, and institutions to take necessary
measures for improving online learning
experiences of students and building
intervention strategies to meet the needs of
students. With rising interest of online
learning and big data produced by students
by interacting with online learning
environments, several machine learning
methods are proposed by researchers to
predict students’ performance and improve
their learning outcomes.
1.1 Background
Machine learning is applied in different
areas. For example, machine learning is
used in search engines to create relations
between webpages and search terms.
Search engines scan the website’s content
and define the most important terms and
words to define a specific webpage and use
the same information to return the most
relevant information to a given search term
(Witten et al., 2016). Machine learning is
also used in image recognition to identify
specific objects like faces (Alpaydin,
2020). Initially, machine learning model
analyzes images with a specific object. If
enough images are given for processing, the
algorithm can determine whether an image
has object(Watt et al, 2020).
Additionally, machine learning can
understand the products a customer might
like. After analyzing the previous products,
the system suggests new product that might
be interesting to the customer (Witten et al,
2016). All such examples have similar
principle. The data is processed by the
system and this data can be identified, and
then this knowledge is used to make future
decisions. The rise in data has been
effective to make such applications more
effective. Machine learning is categorized
into supervised and unsupervised learning
as per the type of input. Input data belongs
to a common class structure in supervised
learning (Mitchell, 2007; Kumar et al,
2012). This input data is called training
data. The algorithm is basically aimed to
create a prediction model for predicting a
property with other properties. After
creating the model, it processes data with
similarclass structure to input data. There is
no known class structure in input data and
algorithm is aimed to reveal the data
structure in unsupervised learning
(Mitchell, 1997; Sugiyama, 2015).
Literature Reviews
Student retention is said to be a major
concern in education. Though intervention
strategies can improve retention rates, prior
knowledge of students’ performance for
those programs (Yadav et al., 2012). This
is where it is important to predict student
performance. Using machine learning for
performance prediction or dropout is a
common pattern in academic studies. In
online learning, dropout prediction is a very
common concern in those studies because
of both easy availability of data and high
4. 4
dropout rates (Kalles and Pierrakeas,
2006). Areas out of online learning are
major contexts where performance or
dropout predictions are widely used for
research purposes. Purpose of these studies
varies. Finding the best prediction
approach is important in some studies.
Some studies are aimed to determine the
viability of machine learning to predict
student performance ordropout.
A study was conducted at the “Eindhoven
University of Technology” to determine
the effectiveness of machine learning to
predict students’ dropout rate (Dekker et
al., 2009). Building various prediction
models with various machine learning
approaches like Logit, BayesNet and
CART are the basic methodology here.
Then, they compared prediction outcomes
of various models in terms of effectiveness.
J48 classifier has successfully built the
most efficient model. A group of
researchers from three Indian universities
have conducted a similar study. They
analyzed the dataset of university students
using various algorithms. They compared
the recall value and precision later. They
got most accurate results with the ADT
decision tree algorithm (Yadav et al.,2012).
However, prediction of performance of
students rather than dropouts is more
relevant with this study. There are also
some studies which have predicted
students’ performance. In “Hellenic Open
University”, a study was done to analyze
the use of ML in distance education by
Kalles & Pierrakeas (2006). They used
decision trees and genetic algorithms to
come up with a predictive model and
compared the results for accuracy. The
“Genetically Evolved Decision Trees
(GATREE)” model has provided most
accurate results.
Amrieh et al (2016) conducted a study on
predicting performance at the “University
of Jordan”. They used a dataset of students
from various nations. Along with separate
machine learning models, they also used
ensembled techniques and compared the
results. They found best result with
decision trees. Behavioral features were the
other area focused by the researchers. They
created a model by taking these features or
without them. The prediction results were
improved by including these behavioral
features.
Cortez & Silva (2008) conducted a study
on performance prediction at the
“University of Minho, Portugal”. The
dataset had information on whether exam
has been passed by the students in
Portuguese language and Mathematics.
They used ML models like random forest,
decision trees, support vector machines,
and neural networks and compared them
for accuracy. They also compared a dataset
with previous exam results and the one
which didn’t have past grades.
Performance was improved by adding
previous results.
2.1 Research Gap
E-learning has become a common form of
education and a vital part of development
of online education (Giannakos & Vlamos,
2013). E-learning has become a common
phenomenon because of the impact of
COVID-19 across the world because of its
spatial flexibility, high temporal, rich
education materials, and low learning
curve. However, teachers cannot perceive
learning status of the students easily in this
mode (Qu et al., 2019). Hence, it raises
concerns over the quality of online
learning. Hence, this study is based on
education, i.e., by predicting student
performance. It also evaluates
effectiveness of various machine learning
approaches. Recall, precision, F-
5. 5
measure, etc. are some of the most
common indicators to determine the
effectiveness of machine learning models
(Powers, 2020).
2.2 Research Question
What are the effective machine learning
models to predict students’ performance
in e-learning?
2.3 Research Objectives
To predict student’s performance in
online learning with machine learning
To evaluate various machine learning
models for performance prediction
Research Methodology
The study of performance prediction
provides a foundation to teachers for
adjusting their approaches of teaching for
students who may have trouble by
predicting their performance on upcoming
exams. It can help reduce the risk of failure
during the course and improve the quality
of online education. With a lot of empirical
studies investigating the relation between
learning performance and e-learning
behavior, the e-learning behavior plays a
vital role on learning outcome. Research
community has focused a lot on prediction
of learning performance on the basis of
data on learning process in recent years.
Teachers can modify their teaching
strategies over time with the help of
collection, analysis, and measurement of
data on learning process to predict learning
performance and start using early warning
and supervision during the learning
processes.
This study is qualitative in nature as around
70 recent studies have been selected related
to various machine learning techniques to
predict student performance (out of which
90% of the studies were published over the
past two decades). The studies selected for
this paper are published in various journals,
conferences, and book chapters. The
corresponding literature has been extracted
from various online databases like IEEE,
Springer, Science Direct, ACM Digital
Library, iJET, Sage Journals, Wiley Online
Library, Google Scholar, etc.
6. 6
Exclusion criteria for this study includes
papers without proper contribution or
having lack of quality or clarity. Research
papers without impact factor or not peer-
reviewed have been excluded. The papers
published in conferences not supported or
published by Springer, IEEE, ACM or other
reputed editorials or organizations were
excluded as well.
Analysis of Study
There are various reasons researchers have
learned students’ learning behavior and
characteristics, such as understanding
learning styles of students, building
students’ profiles to provide personalized
learning, and boosting educational
outcomes. They have investigated different
outcomes which can be grouped into
completion and retention predictions as
well as performance predictions as
retention and success rate are important
variables to assess and measure constantly
in institutions. The performance prediction
is done to predict the efficiency of students
on completing the given course like grade
value, grade category, failure/success,
certificate, and risk prediction. A lot of
researchers have studied “student dropout
prediction” in this field.
4.1. Predicting Student’s
Performance in Online
Learning There are three
performance measures examined in
recent studies for prediction of student
performance – certificate acquisition, grade
prediction, and student at-risk prediction.
4.1.1. Certificate Acquisition
Certificate acquisition of student is one of
the best measures to predict performance of
students in professional courses. In this
category, all the studies have predicted
achievement of certificate of completion.
Hence, achieving course completion and
certificate prediction are the same. A
“certificate prediction model” has been
developed by Kórösi et al. (2018) on the
basis of features associated with mouse
behavior, learning behavior, text inputs,
and video-watching attributes. Various
classification models are used to implement
the “gain-ratio feature selection” method
and bagging and random forest are used to
obtain the best performance.
In the same way, “Al-Shabandar et al.
(2017)” have proposed hill climbing and
random forest model to choose statistical
features automatically from behavioral and
demographical attributes of students. The
“HarvardX (2014)” dataset was used to
under-sample majority class to solve the
problem related to class imbalance. A
“deep neural network” model has been
proposed by Imran et al (2019) for the
prediction of certificate acquisition and
student dropout on the basis of learner
behavioral data.
An approach has been proposed by Liang
et al. (2017) to provide personalized profile
to boost e- learning to manage student
behavior. They explored various
classification models for prediction of
whether a students will get a certificate on
the basis of “Jaccard coefficient similarity”
of learning behavior and student profiles.
They tested the model over 5 to 7 weeks
and obtained bestresults at 7 weeks with the
SVM model. An approach has been
proposed by Ruipérez-Valiente etal. (2017)
which uses “statistical learning behavior”
and progress features with various
approaches of machine learning to predict
whether a certificate will be given to a
student or not.
4.1.2. Grades
It is yet another perspective on success for
any student. A lot of publications have
proposed prediction models for grades of
students in different assessments like final-
7. 7
exam grades, course grades, assignments,
or quizzes in the course. Researchers have
tested regression and classification models
to predict grades. Binary classification has
been used in various studies for prediction
of success and failure of students (Huang et
al, 2020; Liu et al., 2017; Wan et al., 2019;
Lemay & Doleck, 2020; Kokoç et al.,
2021). Various machine learning models
have been examined in mainstream studies
with manually selected features of
statistical learning behavior to predict
grades of students. Xiao et al. (2018) used
“classification and regression tree
(CART)” model on learning behavior data
and statistical demographic for the
classification of final grades of studentsinto
multiple classes. Additionally, a machine
learning model has been proposed by
Villagrá- Arnedo et al. (2016) which
used normalized behavior data for
classification of students in 5 categories
of grades. Here, “support vector machine
(SVM)” model performed better than other
baseline models.
4.1.3. Predicting Students-at-
risk
Along with predicting performance, it is
also important to identify students who
might be failed in a course. Kondo et al
(2017) identified at-risk and off-task
students using “attendance attributes” and
“learning interaction” of “SPOCs dataset”.
Cano & Leonard (2019) proposed a
“genetic programming (GP)” based “early
warning system to address socioeconomic
disadvantages of students. It extracts
classification rules automatically on the
basis of learning interaction, student
demographic, registration, academic
background, and socioeconomic and family
data.
A transfer-based model has been proposed
by Wan et al. (2019) to identify “at-risk
students” on a class test which is held every
week. They extracted behavioral and
statistical features like % of total accurate
answers and total time invested on video
resources from data and combined the same
with weights which were previously
learned from former courses for prediction
of “students-at- risk” and protecting them
from failing in existing course. A model has
been proposed by “El Aouifi et al (2021)”
for prediction of final grade on the basis of
interactions of students with pedagogical
sequences of behaviors related to
educational video like jump forward, play,
pause, jump back, and end. They fathered
sequences as features as per educational
sequences. They used MLP and K-NN to
achieve best results.
4.2. Machine Learning
Models for
Performance
Prediction
A huge range of deep learning and machine
learning techniques have been used by
researchers forprediction of outcomes of e-
learning from the data of students’ online
interaction. In this field, the machine
learning models are categorized into linear
models like “logistic regression”,
probabilistic (Naïve Bayes), tree-based
models like “decision tree (DT)”, ensemble
model like AdaBoost, linear model like
“logistic regression (LR)”, instance-based
model like “K-nearest neighbor (kNN)”,
and rule-based models like “fuzzy logic”
approaches. Additionally, some studies
have used heuristic or optimization
methods like K-star model for prediction of
outcomes.
Recent studies have investigated deep
learning-based models like “long short-
term memory (LSTM)” and “convolutional
neural networks (CNNs)” to predict e-
learning outcomes. Table 1 lists all the
models based on deep learning and
machine learning used in earlier studies.
Figure 1 illustrates the categories of
predictive models in relevant literature.
Deep learning models are used most
8. 8
commonly in recent studies. It is also worth
considering that “deep learning” is the term
which refers to models implementing neural
network with over 3 layers. Hence, studies
developing the “feed-forward neural
network” with over 3 hidden layers are
considered as deep learning.
Category Machine Learning Models
Linear Models “Logistic regression (LR), support vector machine (SVM), linear
discriminant analysis (LDA), generalized linear model (GLM), lasso
linear regression (LLG), boosted logistic regression”
“Probabilistic” “Naive Bayes, Bayes network, Bayesian generalized linear (BGL),
Bayesian belief networks”
“Tree-based models” “Decision tree (DT), random forest (RF), Bayesian additive regression
trees (BART)”
Instance-based “k-nearest neighbors (kNN)”
Rule-Based “Rule-based classifier (JRip), fuzzy set rules”
Neural network “Multilayer perceptron(MLP) or artificial neural network (ANN)”
Deep neural network “Recurrent neural network (RNN), gated recurrent unit (GRU), long
short-term memory (LSTM), convolutional neural network (CNN),
squeeze-and-excitation networks (SE-net)”
Table 1 – Machine learning models and their categories used in previous studies
Figure 1 – Machine learning models used in previous studies
Source – Alhothali et al. (2022)
Recommendation systems gather data on user choices for a range of elements like
9. 9
websites, books, applications, travel
destinations, e-learning, etc. In context of
performance of students, it is possible to
acquire explicit information or implicit
information by gathering scores and
monitoring their behavior like materials
downloaded and visiting study materials
(Bobadilla et al, 2013). Recommender
systems consider various data sources for
predictions. They manage factors like
novelty, precision, stability, and dispersion.
4.2.1. Collaborative Filtering
It plays a vital role in prediction, even
though it is used with other filtering
methods like knowledge-based, content-
based, or social approaches (Bobadilla et
al, 2013). Just like decisions are based as
per previous knowledge and experiences,
collaborative filtering is used to perform
prediction. Some studies have predicted
various issues regarding performance of
students with collaborative filtering.
Hence, Bydžovská (2015) found
similarities between students where their
knowledge was represented as a range of
grades from earlier courses.
4.2.2. “Artificial Neural Networks (ANN)”
ANN includes a range of entities which are
highly connected internally like
“Processing Elements”. The function and
structure of the network is inspired by
human brain. Each element of processing
is designed to work like its neuron, which
accepts weighted inputs and responds by
giving output (Adewale et al., 2018). ANN
is applied in various prediction models,
usually by taking in evaluation outcomes of
students. The scores of tests were predicted
with a feedforward ANN by taking in
partial scores (Gedeon & Turner, 1993).
The academic performance is predicted by
the ANN which uses “Cumulative Grade
Point Average” in 8th
semester (Arsad &
Buniyamin, 2013). Researchers compared
two ANN models, i.e., “Generalized
Regression Neural Network” and
“Multilayer Perceptron” to identify the best
prediction model for academic
performance (Iyanda et al., 2018). Finally,
ANN’s potential for prediction of learning
outcomes is compared to “multivariate LR
model” in medical education
(Dharmasaroja & Kingkaew, 2016).
Results
Application of techniques like
collaborative filtering, machine learning,
artificial neural network and recommender
systems can consider different types of
information for students’ behavioral
prediction, for example, tasks’ grades and
demographic characteristics. A study by
the “Hellenic Open University” is a good
point to start, where researchers applied
various supervised and
10. 10
machine learning models to a specific
dataset. In this study, it is observed that
“Naïve Bayes” model was the best
algorithm to predict both probability and
performance of dropout of students
(Kotsiantis et al., 2004).
Machine learning consists of techniques
which enables computers to learn without
human intervention (Navamani &
Kannammal, 2015). Machine learning has
helped in different applications like stock
market analysis, medical diagnostics,
classification of DNA sequence, robotics,
games, and predictive analysis. Rastrollo-
Guerrero et al. (2020) focused on
predictive analysis where they implemented
complex models for prediction. These
approaches can be helpful to promote
decision-making with relevant data.
Supervised learning promotes models to
reason from cases which are supplied
externally to generate hypothesis which
can make predictions on future events
(Kotsiantis, 2007).
All in all, supervised learning is mainly
aimed to come up with a clear distribution
model of class labels in predictor features.
Rule indication is an excellent supervised
learning approach to make predictions
which can reach 94% accuracy level when
it comes to predict new student dropouts in
nursing (Moseley & Mead, 2008). It is
important to maintain caution with
classification techniques and look for any
unbalanced datasets as they can mislead
when it comes to prediction. This way,
Nandeshwar et al. (2011) proposed various
improvements for prediction of student
dropout like exploring a huge range of
methods, attributes, and evaluating theory
effectiveness and studying factors among
non-dropout and dropout students.
Collaborative filtering approaches play a
vital role in recommendation, even though
they usually go hand in hand with other
techniques like social, content-based and
knowledge based (Bobadilla et al., 2013).
Just like decisions made by humans are
based on their knowledge and past
experience, collaborative filtering acts just
like the same for predictions. Some studies
predicted various issues regarding the
performance of students with collaborative
filtering. Majority of studies discussed in
this article use large data metrices for
performance prediction of students. Hence,
collaborative filtering was not that accurate
in prediction when used in small sample
sizes for the same purpose (Pero &
Horváth, 2015).
Conclusion
On the basis of data collected here,
supervised learning is the most common
technique for student behavioral prediction
as it provides reliable and accurate results.
The SVM model was used most widely by
the authors and has given most accurate
results. Along with SVM, decision tree,
naïve bayes, and random forests are widely
used for good results. Collaborative
filtering algorithms and recommender
systems are widely used in this arena. It is
worth noting that recommending activities
and resources are more successful than
prediction of student behavior. Neural
networks are not used widely but they
achieved great accuracy when it comes to
predict performance of students. This study
can help researchers and give them a lot of
possibilities to apply machine learning for
prediction of performance in e-learning and
other platforms.
References
Adewale, A. M., Bamidele, A. O., &
11. 11
Lateef, U. O. (2018). Predictive modelling
and analysis of academic performance of
secondary school students: Artificial
Neural Network approach. International
Journal of Science and Technology
Education Research, 9(1), 1-8.
Alpaydin, E. (2020). Introduction to
machine learning. MIT press.
Al-Shabandar, R., Hussain, A., Laws, A.,
Keight, R., Lunn, J., & Radi, N. (2017,
May). Machine learning approaches to
predict learning outcomes in Massive open
online courses. In 2017 International joint
conference on neural networks (IJCNN)
(pp. 713-720). IEEE.
Amrieh, E. A., Hamtini, T., & Aljarah, I.
(2016). Mining educational data to predict
student’s academic performance using
ensemble methods. International journal of
database theory and application, 9(8), 119-
136.
Arce, M. E., Crespo, B., & Míguez-
Álvarez, C. (2015). Higher Education
Drop-out in Spain-- Particular Case of
Universities in Galicia. International
Education Studies, 8(5), 247-264.
Arsad, P. M., & Buniyamin, N. (2013,
November). A neural network students'
performance prediction model (NNSPPM).
In 2013 IEEE International Conference on
Smart Instrumentation, Measurement and
Applications (ICSIMA) (pp. 1-5). IEEE.
Bobadilla, J., Ortega, F., Hernando, A., &
Gutiérrez, A. (2013). Recommender
systems survey. Knowledge-based systems,
46, 109-132.
Bydžovská, H. (2015). Student
performance prediction using
collaborative filtering methods. In
Artificial Intelligence in Education: 17th
International Conference, AIED 2015,
Madrid, Spain, June 22-26, 2015.
Proceedings 17 (pp. 550-553). Springer
International Publishing.
Cano, A., & Leonard, J. D. (2019).
Interpretable multiview early warning
system adapted to underrepresented student
populations. IEEE Transactions on
Learning Technologies, 12(2), 198-211.
Cortez, P., & Silva, A. M. G. (2008). Using
data mining to predict secondary school
student performance. Proceedings of 5th
Annual Future Business Technology
Conference, Porto, 5-12.
Dekker, G. W., Pechenizkiy, M., &
Vleeshouwers, J. M. (2009). Predicting
Students Drop Out: A Case Study.
International Working Group on
Educational Data Mining.
Dharmasaroja, P., & Kingkaew, N. (2016,
August). Application of artificial neural
networks for prediction of learning
performances. In 2016 12th International
Conference on Natural Computation,
Fuzzy Systems and Knowledge Discovery
(ICNC-FSKD) (pp. 745-751). IEEE.
El Aouifi, H., El Hajji, M., Es-Saady, Y.,
& Douzi, H. (2021). Predicting learner’s
performance through video sequences
viewing behavior analysis using
educational data-mining. Education and
Information Technologies, 26(5), 5799-
5814.
Gedeon, T. D., & Turner, H. S. (1993,
October). Explaining student grades
predicted by a neural network. In
Proceedings of 1993 International
Conference on Neural Networks (IJCNN-
93- Nagoya, Japan) (Vol. 1, pp. 609-612).
IEEE.
Giannakos, M. N., & Vlamos, P. (2013).
Educational webcasts' acceptance:
Empirical examination and the role of
experience. British Journal of Educational
Technology, 44(1), 125-143.
HarvardX. (2014). HarvardX Person-
Course Academic Year 2013 De-Identified
Dataset, Version3.0.
Huang, A. Y., Lu, O. H., Huang, J. C., Yin,
C. J., & Yang, S. J. (2020). Predicting
students’ academic performance by using
educational big data and learning analytics:
evaluation of classification methods and
learning logs. Interactive Learning
Environments, 28(2), 206-230.
Imran, A. S., Dalipi, F., & Kastrati, Z.
(2019, April). Predicting student dropout in
a MOOC: An evaluation of a deep neural
network model. In Proceedings of the 2019
5th International Conference on
12. 12
Computing and Artificial Intelligence (pp.
190-195).
Iyanda, A. R., Ninan, O. D., Ajayi, A. O., &
Anyabolu, O. G. (2018). Predicting Student
Academic Performance in Computer
Science Courses: A Comparison of
Neural Network Models. International
Journal of Modern Education & Computer
Science, 10(6).
Kalles, D., & Pierrakeas, C. (2006).
Analyzing student performance in distance
learning with genetic algorithms and
decision trees. Applied Artificial
Intelligence, 20(8), 655-674.
Kokoç, M., Akçapınar, G., & Hasnine, M.
N. (2021). Unfolding students’ online
assignment submission behavioral patterns
using temporal learning analytics.
Educational Technology & Society, 24(1),
223-235.
Kondo, N., Okubo, M., & Hatanaka, T.
(2017, July). Early detection of at-risk
students using machine learning based on
LMS log data. In 2017 6th IIAI
international congress on advanced
applied informatics (IIAI-AAI) (pp. 198-
201). IEEE.
Kórösi, G., Esztelecki, P., Farkas, R., &
Tóth, K. (2018). Clickstream-based
outcome prediction in short video MOOCs.
In 2018 International conference on
computer, information and
telecommunication systems (CITS) (pp. 1-
5). IEEE.
Kotsiantis, S. (2007). A review of
classification techniques, sb kotsiantis.
Informatica, 31, 249- 268.
Kotsiantis, S., Pierrakeas, C., & Pintelas, P.
(2004). PREDICTING
STUDENTS'PERFORMANCE IN
DISTANCE LEARNING USING
MACHINE LEARNING
TECHNIQUES. Applied Artificial
Intelligence, 18(5), 411-426.
Kumar, S., Mohri, M., & Talwalkar, A.
(2012). Sampling methods for the Nyström
method. The Journal of Machine Learning
Research, 13(1), 981-1006.
Larusson, J. A., & White, B. (Eds.). (2014).
Learning analytics: From research to
practice (Vol. 13). Springer.
Lemay, D. J., & Doleck, T. (2020). Grade
prediction of weekly assignments in
MOOCS: mining video-viewing behavior.
Education and Information Technologies,
25, 1333-1342.
Liang, K., Zhang, Y., He, Y., Zhou, Y.,
Tan, W., & Li, X. (2017). Online behavior
analysis-basedstudent profile for intelligent
E-learning. Journal of Electrical and
Computer Engineering, 2017. Liu, W., Wu,
J., Gao, X., & Feng, K. (2017, December).
An early warning model of student
achievement based on decision trees
algorithm. In 2017 IEEE 6th International
Conference on Teaching, Assessment, and
Learning for Engineering (TALE) (pp. 517-
222). IEEE.
Mitchell, T. M. (2007). Machine learning
(Vol. 1). New York: McGraw-hill.
Moseley, L. G., & Mead, D. M. (2008).
Predicting who will drop out of nursing
courses: a machine learning exercise. Nurse
education today, 28(4), 469-475.
Nandeshwar, A., Menzies, T., & Nelson,
A. (2011). Learning patterns of university
student retention. Expert Systems with
Applications, 38(12), 14984-14996.
Navamani, J. M. A., & Kannammal, A.
(2015). Predicting performance of schools
by applying data mining techniques on
public examination results. Research
Journal of Applied Sciences, Engineering
and Technology, 9(4), 262-271.
Pero, Š., & Horváth, T. (2015).
Comparison of collaborative-filtering
techniques for small-scale student
performance prediction task. In
Innovations and Advances in Computing,
Informatics, Systems Sciences, Networking
and Engineering (pp. 111-116). Springer
International Publishing. Powers, D. M.
(2020). Evaluation: from precision, recall
and F-measure to ROC, informedness,
markedness and correlation. arXiv preprint
arXiv:2010.16061.
Qu, S., Li, K., Wu, B., Zhang, X., & Zhu, K.
(2019). Predicting student performance and
deficiencyin mastering knowledge points in
13. 13
MOOCs using multi-task learning.
Entropy, 21(12), 1216.
-Guerrero, J. L., Gómez-Pulido, J. A., &
Durán-Domínguez, A. (2020). Analyzing
and predicting students’ performance by
means of machine learning: A review.
Applied sciences, 10(3), 1042.
Ruipérez-Valiente, J. A., Cobos, R.,
Muñoz-Merino, P. J., Andujar, Á., &
Delgado Kloos, C. (2017). Early
prediction and variable importance of
certificate accomplishment in a MOOC.
In Digital Education: Out to the World and
Back to the Campus: 5th European
MOOCs Stakeholders Summit, EMOOCs
2017, Madrid, Spain, May 22-26, 2017,
Proceedings 5 (pp. 263- 272). Springer
International Publishing.
Sugiyama, M. (2015). Introduction to
statistical machine learning. Morgan
Kaufmann.
Villagrá-Arnedo, C. J., Gallego-Durán, F. J.,
Compañ, P., Llorens Largo, F., & Molina-
Carmona,
R. (2016). Predicting academic performance
from behavioural and learning data.
Wan, H., Liu, K., Yu, Q., & Gao, X.
(2019). Pedagogical intervention practices:
Improving learning engagement based on
early prediction. IEEE Transactions on
Learning Technologies, 12(2), 278-289.
Wan, H., Liu, K., Yu, Q., & Gao, X.
(2019). Pedagogical intervention practices:
Improving learning engagement based on
early prediction. IEEE Transactions on
Learning Technologies, 12(2), 278-289.
Watt, J., Borhani, R., & Katsaggelos, A. K.
(2020). Machine learning refined:
Foundations, algorithms, and applications.
Cambridge University Press.
Witten, I. H., Frank, E., Hall, M. A., Pal, C.
J., & DATA, M. (2016, June). Practical
machine learning tools and techniques. In
Data Mining (Vol. 2, No. 4).
Xiao, B., Liang, M., & Ma, J. (2018,
October). The application of CART
algorithm in analyzing relationship of
MOOC learning behavior and grades. In
2018 International Conference on Sensor
Networks and Signal Processing (SNSP)
(pp. 250-254). IEEE.
Yadav, S. K., Bharadwaj, B., & Pal, S.
(2012). Mining Education data to predict
student's retention: a comparative study.
arXiv preprint arXiv:1203.2987.