TEM Journal. Volume 11, Issue 2, pages 497-505, ISSN 2217-8309, DOI: 10.18421/TEM112-01, May 2022.

Data Analysis of Short-term and Long-term Online Activities in LMS
Carmen Carrión

School of Computer Engineering, University of Castilla-La Mancha, 02071 Albacete, Spain

Abstract – Online teaching activities based on increasingly used computer-based educational systems lack standard rules for their implementation. This paper describes the design of online training activities using Moodle as a Learning Management System (LMS) and evaluates students’ short-term and long-term learning outcomes by applying data mining techniques. Clustering and classification algorithms are combined to uncover valuable, non-obvious student patterns from a well-defined collection of data. Results from online quiz-based activities in a Computer Science subject show that students who do not engage in the training activity during the short-term learning process fail. Data analysis also shows that the number of trials is a key attribute. Hence, it is important to develop user-friendly online activities with real-time feedback based on student behaviour. Moreover, according to our experiment, online training activities decrease in efficiency over time.

Keywords – higher education, Learning Management System, Data Mining, students’ behaviour, students’ outcomes.

DOI: 10.18421/TEM112-01
https://doi.org/10.18421/TEM112-01
Corresponding author: Carmen Carrión, School of Computer Engineering, University of Castilla-La Mancha, 02071 Albacete, Spain.
Email: carmen.carrion@uclm.es
Received: 27 January 2022.
Revised: 15 March 2022.
Accepted: 23 March 2022.
Published: 27 May 2022.
© 2022 Carmen Carrión; published by UIKTEN. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 License.
The article is published with Open Access at https://www.temjournal.com/

1. Introduction

The COVID-19 pandemic stresses the need for innovative learning technologies based on online activities [1]. The spread of COVID-19 and its variants means that many educational centres have to switch to an educational model in which part, if not all, of their teaching is online. Thus, the teacher incorporates open educational resources or activities based on Learning Management Systems (LMS) to complement face-to-face classes [2], [3], [4]. Massive Open Online Courses (MOOCs) even appear as an extreme expression in which all teaching is online [5]. However, no standard guides exist on how to implement an online activity in a course or grade, so implementation is always left to the best judgment of the teacher in charge [6], [7].

Computer-based educational systems are a fundamental pillar of any online educational activity. So, LMSs such as Moodle, Edmodo or Canvas are becoming widely used in higher education [4]. Nowadays, LMSs are mainly used as a static information exchange tool [8]. On the one hand, the teacher uploads information about the syllabus and the slides of the course and, on the other hand, the students upload the results of the laboratory practices and the exercises set as homework [9]. But LMSs often have configurable functional features, such as quizzes, puzzles or blogs, that allow dynamic interaction and students’ collaboration to be included in the online teaching-learning process. The primary purpose of these functional features is to increase students’ motivation, engagement and academic results. Note that, in online learning environments, measuring the efficiency of the developed activities and adapting the teaching process to the feedback received from the students is not as straightforward as in traditional face-to-face classes. Efficiency analysis of online activities is only possible if sufficient information about the individual characteristics of the students is obtained through interactions with the environment. Researchers usually apply different methodologies to the educational context in order to discover hidden knowledge and patterns in the learning process [10]. Among them, Data Mining (DM) techniques have become increasingly prominent for improving the teaching-learning process. They make use of mathematical algorithms to extract non-obvious behavioural patterns of interest from a large data set [11].


In this paper, we emphasize understanding the education process in its full complexity, leveraging teacher judgment, and applying DM techniques to gain insight into the students’ learning process [12], [13], [14]. This task is not easy, and the studies carried out so far report contrasting findings on the relationship between students’ online behaviour and their predicted performance [13], [16], [17].

To tackle this issue, more research is required, and this paper takes a step forward by establishing a set of objectively verifiable indicators for online teaching-learning activities and by examining measures related to short-term and long-term knowledge. Hence, the major aim of this paper is to extract valuable information about the achievement of online activities in the short term and the long term in order to guide teachers in their development and to continuously improve learning outcomes.

The data-driven analysis collects student behaviour during the completion of online activities via the LMS and makes use of DM clustering and classification techniques. The paper presents a use case for Computer Engineering, but the study can be applied to different educational environments, courses and subjects. Specifically, the aim of this paper is to answer questions like these:

• Is there a relationship between students’ online activity behaviours and their short-term and long-term outcomes?
• Can relevant information be extracted from the online activity interaction data to understand its impact, guide the teacher, and eliminate activity-related learning deficiencies, if any?

Education is a living process, and the proposed data-driven analysis of the online activities will support the teacher with the feedback needed to succeed in the teaching-learning process. To sum up, the key contributions of this paper are:

• The proposal of a generic and reproducible online teaching-learning activity based on open-source tools.
• The definition of a set of quantitative parameters useful for the data analysis of online activities.
• The use of DM as a tool to discover significant patterns that model students’ trends in both the short-term and the long-term learning process.
• The analysis and evaluation of a practical use case in computer science.

The paper is organized as follows. In Section 2, we present the results obtained and the techniques employed by the related work. Details of our life cycle for designing and analysing online activities are discussed in Section 3. Then, Sections 4 and 5 detail the data parameters and the data mining analysis applied in our approach, respectively. After that, a use case in Computer Science is presented in Section 6, followed by the data analysis results in Section 6-a. Finally, Section 7 presents the conclusions and the future work.

2. Related Work

Different DM techniques for finding hidden knowledge and patterns in the learning process are applied to educational environments [10]. Several systematic literature reviews have been conducted in education to describe the state of the art in this area [18], [19], [20]. Clustering and classification DM techniques provide good results in predicting students’ results [21].

In [18], the authors present a systematic literature review covering over three decades of work on clustering algorithms and their applicability in Educational Data Mining (EDM). The paper outlines that educational data are non-independent in nature and that clustering can provide a relatively unambiguous outline of student learning style as a function of several variables. In [19], the authors show that most of the research work on predicting students’ performance looks at predicting attainable academic metrics such as exam grade, course grade, program retention or dropout, or assignment performance [22], [23]. For example, in [23] the authors use decision trees to apply classification techniques and detect factors linked to academic performance in large-scale assessments. Their results indicate that personal factors are the most indicative of academic performance, followed by school-related and social factors. Moreover, according to [19], data on students’ activity is one of the least explored sources for predicting students’ performance, and it is the focus of this paper.

The effect of students’ online interaction with Moodle and its relationship with achievement is examined in [15]. The variables considered in [15] are time task, time theory, time forums, word forums, relevant actions and procrastination. Additionally, final marks were extracted from the performance in the subject. Using k-means clustering techniques, they found that more activity in the LMS does not assure better results. Moreover, they found that the students who handed in the task later were more likely to receive a lower score, that is, the more procrastination, the worse the performance.

In [16], significant indicators from the LMS data such as regular study, total viewing time, sessions, late submissions, proof of reading the course information packets, and messages created were chosen as predictors. The results revealed that regular study was the strongest predictor of course achievement, followed by late submissions, sessions, and proof of reading the course information packets. However, students’ total viewing time and messages created were not significant.


In [13], the authors use the time spent accessing online learning materials (video, slides, etc.) to classify students into different clusters. They found that behaviours of accessing online learning materials were associated with learning performance. More precisely, the students who invested more time and effort in viewing the online learning materials had better learning performance.

In [17], the authors use the k-means clustering algorithm to examine whether students’ interaction with different online learning activities affects students’ learning performance, motivation, and self-regulated learning strategies. Their results conclude that the students who spent more time in the learning activities obtained higher academic outcomes.

Nevertheless, at this point we have to highlight that the conclusions drawn in [15], [16] contradict the abovementioned results in [13], [17]. This may be because the analyses were done under different contexts, types of courses and student backgrounds [24], [17]. In any case, these facts underline the need for further studies to establish some baselines and avoid the contradictions, as well as to set up a set of objectively verifiable indicators for the teaching-learning process. For this reason, this paper presents the results obtained for a use case using a generic methodology based on the analysis of online activity data for the short term and the long term.

3. Development and Evaluation of Short-term and Long-term Online Activities

The methodology applied in this paper to model and analyze the effectiveness of online activities using Moodle LMS includes a complete life cycle that combines two main roles: the in-class and the off-stage. Figure 1 details the steps involved in the development of the online activities. Next, we explain all the involved steps in detail.

Figure 1. Proposed methodology to develop online activities

The in-class role comprises those stages in which the student participates actively even though they are conducted online. More precisely, the in-class role includes the training, the validation, and the final step.

• Training and Practice step: Online training activities are web-based activities accessible via the Internet that are available 24/7 up to a prefixed day on the digital platform. So, students can access the online training activity whenever they want, use the time they need and repeat the online training activity as many times as they wish. In contrast to face-to-face training activities, where the teacher provides the appropriate coaching and supervision while the activity is running in the classroom, online training activities lack the direct support of the teacher. Hence, it is essential to give specific and easy-to-understand instructions on how to carry out the activity.
• Validation step: After the training step the students should have acquired new skills and knowledge. Therefore, the validation step allows them to check their improvement. At this point, the student receives the first quantifiable performance indicators about his or her progress. This step can be an important turning point for students, being a very stimulating phase in the learning process or just the opposite. The parameters collected in this step are related to short-term learning. In this paper, this term is defined as the learning process over the course of a single session or during the training period of a single online activity.
• Final step: This is the last step of the in-class role and provides information about the long-term learning process. Data collection includes both cognitive and affective outcomes. The students will complete theoretical questionnaires and practical exercises at the end of the course to measure the cognitive outcome, and will complete post-test questionnaires for the affective outcomes.

The off-stage role is characterized by the fact that the student does not actively participate in it, although it is crucial for the achievement of the objectives for which the online activity is carried out. The off-stage role includes the design and implementation, the analysis and the feedback step.

• Design and implementation step: Learning activities are created and designed to achieve learning objectives. So, in this step the teacher addresses the challenges of analyzing the capacity and feasibility of the online activities to work on the learning outcomes: the complexity, the context and the timeline of the activity have to be taken into account. It is compulsory to examine carefully the students’ previous understanding of the topic, as well as the activity complexity, to reduce the negative outcome of poor performance. Then, well-structured support has to be developed before starting the learning activity. Both the required theoretical knowledge and a detailed description of how to do the activity have to be provided to the student in order to successfully achieve the goals of the activity. Moreover, an updated scheduling timetable has to be
presented to the students so that they know in which sessions the activity will be running and also some important deadlines. For example, if the activity requires the production of a deliverable, the deliverable always has to be uploaded to the LMS on time. Setting deadlines for activities in advance is especially important if they involve extra work outside of class time. To sum up, the design and implementation step is a key phase in the off-stage role that includes:

• The teacher’s training period, in which the teacher searches for new teaching approaches and learns the details of using new digital tools.
• The design period, in which the teacher plans the activities considering the learning objectives and the time available for the activity. The teacher also has to design the observable parameters for further analysis. A detailed description is provided in Section 4.
• Finally, the get-ready step, in which the teacher prepares all the materials. This step may involve, for example, arranging access to a digital platform, preparing some hints and prizes and/or designing a multiple-choice test.

• Analysis step: Both a statistical exploration of data and DM techniques can be used to discover useful information which will allow us to improve the scheduling and the design of the activity. The first steps of the DM analysis include data collection and pre-processing. In some cases, data collection for students requires knowledge of applicable personal privacy laws and regulations. In the proposed data analysis, just after the pre-processing, a clustering technique is combined with classification techniques to observe students’ patterns and accurately predict important features. In short, data is transformed into useful information for decision-making. More details of the proposed data analysis are provided in Section 5.
• Feedback step: Based on the reports obtained in the previous step, we will detect and even anticipate possible drifts in the teaching-learning process. It is about predicting deviations and non-effective performance in the classroom, as well as decreasing the dropout rate. The effectiveness of the activity is improved by applying successive refinements that consider the feedback obtained from the data analysis. The key idea is to adapt the teaching-learning activity by solving the detected problems and reinforcing the mechanisms that do work. It is important to note that the feedback step need not be the last one of the season. Data analysis can trigger a reactive mechanism on a continuous basis while the activity is in progress. For example, it would be possible to collect and analyze information during the training step.

4. Collecting Parameters for Data Mining

Improving the efficiency of teaching activities requires the definition of suitable metrics. It is necessary to collect some quantifiable parameters to measure both the students’ behavior and the performance that denotes the level of success. Examples of these parameters may be the percentage of tasks completed or the score obtained in a test. Specifically, the parameters employed in this research have been classified, considering their target end-goal, as cognitive and behavioral. Cognitive parameters are directly related to academic scores or grades, and behavioral parameters are related to the student’s commitment and attitude during the course. Among the parameters used to analyze the student behavior we have collected:

The Trial (T) is measured as the number of times a student completes an activity. Note that some online quizzes can be done as many times as students need, but some just-in-time teaching activities can be done only once.

The Reinforcement (Re) computes the number of trials done by the student after having obtained the maximum score in the activity.

The Readiness (R) is a measure of the student’s willingness to do the activity. More precisely, in this work readiness has been calculated as how far in advance of the delivery time the student takes the task.

The learning Velocity (V) is measured as the number of trials done by the student to obtain the maximum mark. The parameter reflects in some way the learning capacity of the student. It can be a useful metric to give personal assistance to the student.

The Cost (C) is measured as the time each student spends to complete the learning activity.

The Participation (P) is a global metric that measures the total number of students that participate in a teaching-learning activity.

To measure the cognitive outcomes, we have collected:

The Initial Score (IS) provides a quantitative measure of the student’s initial level of knowledge on the topic. A high score denotes a high probability that the student already had the targeted skills before starting the activity, while a low value indicates the opposite.

The Recent Score (RS) measures the short-term knowledge acquired by the student. RS takes into account what students know at the end of a training and practice step without considering the mistakes made during that learning process. This metric should be collected as soon as the practice step concludes.

The Final Score (FS) computes the long-term knowledge. The key idea is to tackle the final activity, probably at the end of the course, to measure
the knowledge acquired by the students thanks to the developed activity.

The Effectiveness (E) calculates the number of students that did not possess the targeted knowledge before the activity but ended up successfully at the end.

5. DM in the Off-stage Role

In this section, we outline the steps to be carried out to extract value from the data obtained from the students’ interaction with the Moodle activities. We want to verify whether the online activities have been designed successfully and therefore have boosted the teaching-learning process. In general, the learning data mining process consists of the following steps: collection, pre-processing, mining, and evaluation. Figure 2 outlines the main steps of our approach.

Figure 2. Workflow for the data analysis of the online activities

Moodle provides us with information about the materials accessed by the students and records all the clicks made by the students when they navigate through the different resources. This data constitutes the student’s digital footprint in Moodle and can be exported in different formats to start processing. We have specifically exported the information provided by the institutional Moodle, version 3.5 [4]. By processing these digital traces we obtain the parameters indicated in Section 4.

More precisely, from all the available data, we have extracted the data collected by the Moodle quiz module for each online activity and the final grades obtained by the students. It should be noted that the quiz module collects information about the student’s name and the way of accessing the platform. It also includes the access and completion times of the quiz and the score obtained. These data are recorded each time the student takes the activity, and in our case there may be several trials. The analysis of these data allows us to make a short-term cognitive analysis, while the final results allow us to analyze the long-term cognitive results. Moodle provides this data in text files in CSV format.

In the data pre-processing stage, the data are purged. First of all, duplicate attributes and unnecessary attributes such as e-mail or access IP address are removed. Data associated with teachers are also removed, and all data are filtered to consider only those online activities that have been successfully completed. In addition, at this stage, the student’s first and last name is converted into a numeric fingerprint that uniquely and anonymously identifies each student.

After that, using a computer spreadsheet application, simple calculations were performed to obtain the quantitative values of the metrics defined in Section 4. In a course, as many short-term online activities (Ai) can be performed as learning objectives are defined (N). Therefore, for each metric defined in Section 4 we obtain a vector of N values. In our analysis, we have computed a global value for each parameter to avoid non-significant fluctuations. In general, the global value of a parameter (labeled as K) is computed as a weighted (Wi) average over the individual activities (Ai) according to Eq. 1, where $K_i$ denotes the value of the parameter in activity $A_i$:

$$K = \frac{\sum_{i=1}^{N} W_i \, K_i}{\sum_{i=1}^{N} W_i} \qquad (1)$$

Lastly, in the pre-processing step the numerical values are transformed into discrete attributes following unsupervised methods. These methods allow the discretization of continuous values into categorical classes following different criteria (e.g. equal width, equal frequency). In our case, the parameters defined to analyze the behavior of the students have been discretized following the equal width method with four intervals. However, for the discretization of the cognitive values a manual criterion has been defined. It has four intervals with the labels Fail, Pass, Good and Excellent. Thus, for the cognitive metrics, a value greater than or equal to 9 is classified as Excellent and a value less than 5 is classified as Fail. In addition, the label Pass is assigned to values greater than or equal to 50 but less than 70, and the label Good to values greater than or equal to 70 but less than 90.

After pre-processing, clustering and classification techniques are performed. Our approach follows the technique detailed in [25]. In general, a clustering algorithm groups students with similar properties. In this study we make use of two clustering approaches, one manual and one automatic. The manual clustering groups students according to their final marks, while the automatic clustering uses a clustering algorithm based on all the behavioral data. In this case we use the Expectation-Maximization (EM) clustering algorithm, which groups students with similar characteristics without the need to indicate the number of clusters. More precisely, the EM algorithm finds maximum likelihood estimators of parameters in probabilistic models that rely on unobservable variables.


Then, for the purpose of observing students’ patterns, we apply some classification techniques. The idea is to use the criterion variable (RS or FS) as a reference for the algorithm to split the input data set into mutually exclusive subgroups. The aim of using the classification algorithm is to model and predict RS and FS for each case of the input dataset. Note that RS is the criterion variable for the short-term teaching-learning process, while FS is used for the long-term process. The better the students are ranked on the criterion variable, the higher the validity of the model. Individual well-known classification techniques, such as C4.5, LMT, RandomTree or HoeffdingTree, should be explored to increase the final accuracy and precision of the system.

6. On-line Activities Carried out in the Subject of Computer Architecture

The online activities outlined in Section 3 have been implemented this academic year in the third-year subject of Computer Engineering called Computer Architecture at the University of Castilla-La Mancha (UCLM). The university provides an institutional Moodle, currently in its version 3.5 [26].

In particular, eight short-term online activities were scheduled, one for each learning objective or subject topic. Participation in these online activities was not compulsory for students, but participation was encouraged using credits. So, students involved in the online activities earn credits that are redeemable for a percentage of the final course grade, in our case up to 10% of the final grade of the course. It should be noted that all students enrolled in the course participated in the online activities.

In the design and implementation step of the off-stage role, a bank of multiple-choice questions classified by topic is generated. In our case we have 8 topics and 542 questions. Then, the Moodle quiz module is used to generate the eight online activities and schedule them appropriately throughout the term. Each of these quizzes is available for the Training and Practice step, and the student can repeat the quiz as many times as he/she wishes. A quiz consists of 10 multiple-choice questions randomly selected from the question bank within the topic to be studied. Once the training period is over, the validation stage takes place and the students perform the final test of the short-term online activity.

In addition, a final global questionnaire is carried out at the end of the course. This questionnaire is made up of 30 randomly selected questions and covers all the topics of the course. The data obtained from this questionnaire constitute the final step of the in-class role, and their analysis yields what in this work has been called the long-term results of the online activities. All these students’ interactions with the online activities are tracked by Moodle and allow us to extract value from the data through data analytics, as shown in the next section.

Table 1. Centroids of each cluster (mean, std. dev.)

a. Data analysis results

The results shown in this section have been obtained using Weka [27]. First, we have applied the expectation-maximization (EM) clustering algorithm. This algorithm sorts the students into three groups that we call Cluster0, Cluster1 and Cluster2. Each cluster includes a different number of students but with similar behavior. Table 1 shows this information: the centroids and standard deviations of the behavioral parameters described in Section 4 for each cluster. Note that the centroid value does not have to represent the behavior of any particular student but describes the most typical case within the cluster. Students in Cluster0 have the lowest interaction values with the online activity. In Cluster0, the average number of times the activity is performed in the training period, T, is less than 1. Moreover, in this cluster the short time spent performing the activity (C parameter) is striking. On the other hand, it can be seen that the students composing Cluster2 are quite active in the training period. In Cluster2, students make use of the training activity a high number of times (T=8) and also dedicate more time to the training process (high values in C and R). In Cluster1, students also show active participation in online activities, T=3, although with lower values than Cluster2.

First, we analyze whether the students’ behavior associated with Cluster0, Cluster1 and Cluster2 affects students’ cognitive outcomes. Figures 3-a and 3-b show the distribution of students in each cluster according to the cognitive values RS and FS, respectively. Figure 3-a shows the results of the short-term learning process, and it can be observed that students in Cluster2 obtain the best results. More than 85% obtain an excellent result. Furthermore, only students from Cluster0 fail the activity. The higher the participation in the training, the better the academic results. These results highlight the effectiveness of the online activities in the short term.
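The per-cluster distributions reported for Figure 3 amount to a cross-tabulation of cluster membership against the discretized score label. A minimal Python sketch of that computation follows, assuming each student has already been assigned a cluster and an RS or FS label. The function name and the sample records are hypothetical and are not the study's data.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

LABELS = ("Fail", "Pass", "Good", "Excellent")


def label_share_by_cluster(
    assignments: Iterable[Tuple[str, str]]
) -> Dict[str, Dict[str, float]]:
    """Return, for each cluster, the percentage of its students per label.

    `assignments` yields one (cluster, label) pair per student.
    """
    counts = defaultdict(Counter)
    for cluster, label in assignments:
        counts[cluster][label] += 1

    shares = {}
    for cluster, counter in counts.items():
        total = sum(counter.values())
        # Labels that never occur in a cluster are reported as 0.0 percent.
        shares[cluster] = {label: 100.0 * counter[label] / total for label in LABELS}
    return shares


if __name__ == "__main__":
    # Made-up records, used only to show the shape of the result.
    sample = [
        ("Cluster0", "Fail"),
        ("Cluster0", "Pass"),
        ("Cluster2", "Excellent"),
        ("Cluster2", "Excellent"),
        ("Cluster2", "Excellent"),
        ("Cluster2", "Good"),
    ]
    for cluster, share in sorted(label_share_by_cluster(sample).items()):
        print(cluster, share)
```

Running the same function once with RS labels and once with FS labels would give the two panels of such a figure.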


Figure 3-a. Cognitive distribution of students according to the EM algorithm (RS distribution)

Figure 3-b. Cognitive distribution of students according to the EM algorithm (RF distribution)

Figure 3-b shows the FS distribution, which is closely linked to the long-term teaching-learning process. In this case, the high percentage of students who do not pass the test stands out. The results therefore show that, over time, learners have forgotten what they learnt in the online training phase. However, students in Cluster2 are less likely to fail. It should be mentioned that the final test is more complex than the tests taken in the short-term learning process, as it covers the content of all the topics of the subject. Although this aspect may affect the results obtained, the impact of online learning activities on long-term academic results is limited.

To continue the search for valuable information, the data are analyzed by applying decision trees. This analysis aims to identify the behavioral parameters that have the greatest impact on academic outcomes, so that inappropriate behavior can be detected and corrective measures applied in future implementations to improve the quality of the online activities. The process comprises two parts: analyzing the short-term teaching-learning process, which takes into account RS, and the long-term learning process, which is directly related to the FS attribute. Four datasets are evaluated: the three datasets obtained from the EM algorithm, called Cluster0, Cluster1 and Cluster2, and the dataset that includes all the students, hereinafter called ClusterAll.

1) Short-term teaching-learning process DM analysis: First, we execute the J48 classification algorithm using the behavioral attributes and consider RS as the objective attribute with four nominal values (Fail, Pass, Good and Excellent). Table 2 summarizes the main characteristics (precision, number of nodes and leaves) when the J48 classification algorithm is applied and RS is the objective attribute to predict. The EM clustering algorithm benefits the classification algorithm by reducing the complexity of the decision tree.

Table 2. Output performance of the J48 algorithm for RS as classification attribute

Figure 4 shows the decision trees obtained for ClusterAll and Cluster0. The results reflect that R, T and their standard deviations are the key parameters. A high value of R leads to an Excellent outcome (see Figure 4-a). An important finding is that T is a key attribute: all the Fail outcomes are associated with students who were not involved in the training step. Another finding is that Re does not appear in the decision tree for any dataset. Moreover, C is not a decisive attribute and, counterintuitively, the classification algorithm shows that students with a lower cost value get better outcomes.

Figure 4-a. Classification of students for the short-term learning process using the J48 algorithm (ClusterAll)
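
As a rough illustration of how a C4.5-style learner such as J48 decides which attribute to place at a node, the sketch below computes the gain ratio of nominal attributes with respect to RS. It is only a minimal sketch under stated assumptions: the records are invented toy data (not the study's dataset), T and C are discretised to low/high purely for illustration, and the handling of numeric thresholds and pruning performed by the real algorithm is omitted.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of nominal labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of a nominal attribute with respect to the target."""
    # Group the target labels by the value the attribute takes.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    # Expected entropy of the target after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records (assumption): not taken from the paper's data.
students = [
    {"T": "low", "C": "high", "RS": "Fail"},
    {"T": "low", "C": "low", "RS": "Fail"},
    {"T": "high", "C": "high", "RS": "Excellent"},
    {"T": "high", "C": "low", "RS": "Excellent"},
    {"T": "high", "C": "low", "RS": "Good"},
    {"T": "high", "C": "high", "RS": "Good"},
]

ratios = {a: gain_ratio(students, a, "RS") for a in ("T", "C")}
best = max(ratios, key=ratios.get)  # attribute chosen for the root split
print(best, ratios)
```

In this toy set the low-T records are exactly the Fail records, so T receives the higher gain ratio and would be chosen for the root node, while C carries no information about RS. The same criterion, applied recursively to each resulting subset, yields a tree of the kind shown in Figure 4.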


Figure 4-b. Classification of students for the short-term learning process using the J48 algorithm (Cluster0)

2) Long-term teaching-learning process DM analysis: In this case, the FS parameter is used to discover the long-term learning process model, as well as to figure out how to improve the process from the student’s point of view. Table 3 summarizes the results of applying the J48 algorithm to predict RF with the different datasets, and Figure 5 shows the decision trees obtained.

Table 3. Output performance of the J48 algorithm for RF as classification attribute

In Cluster0, students obtain the highest scores in FS only if they have a high score in RS and participate actively in the training step (high value of T). It is remarkable that students with RS=Excellent but low participation in the training step fail the final step. The data show that most of them are students who are retaking the course. In short, students of Cluster0 achieve good final scores, FS, if they are involved in the training step and attain good results in the short-term learning process. The decision tree obtained for Cluster1 students shows that R, T and the standard deviations of T and Re are the attributes used in its internal nodes to classify the target nominal values of FS (see Figure 5-b). Moreover, Cluster2 students are classified according to T, C and the standard deviations of T and Re (see Figure 5-c). Note that Cluster2 is the most active group in the online activity. This group fails with a high standard deviation of T and reaches the best score with a low standard deviation of T and a high T.

To sum up, clustering techniques reduce the complexity of the models, enabling better data analysis, but long-term outcomes present different behavioral patterns and are not related only to short-term academic outcomes.

7. Conclusions and Future Work
Tracking and analyzing students’ activity is critical
in online activities. In this work, behavioral and
cognitive parameters of online activities have been
defined, and techniques based on DM have
uncovered significant patterns in students’ behavior.
These patterns are key to recognizing trends in
students’ work and improving their learning
outcomes. The data analysis shows that integrating
online training activities into the learning process
improves students’ results, especially in the short
term. Moreover, the parameter that most influences
the cognitive outcomes is the number of times a
training activity is completed. It is important to
highlight that the methodology has been clearly
described, and open-source tools have been used to
simplify the comparison and replication of similar
experiences for future improvements. The datasets
extracted from our online activities using Moodle
quizzes are available to the scientific community.
We believe that online training activities could be
improved by integrating a recommendation
mechanism generated from the behavioral metrics of
each student and the results obtained in this work.
Therefore, we would like to extend the online
activities with a real-time notification system that
guides students in the in-class role.

Figure 5. Classification of students for the long-term learning process using the J48 algorithm


References

[1]. Mishra, L., Gupta, T., & Shree, A. (2020). Online teaching-learning in higher education during lockdown period of COVID-19 pandemic. International Journal of Educational Research Open, 1, 100012.
[2]. Kahoot (2022). Kahoot, learning games, make learning awesome! Retrieved from: https://kahoot.com/ [accessed: 23 December 2022].
[3]. Socrative (2022). Meet Socrative: Your classroom app for fun, effective engagement and on-the-fly assessments. Retrieved from: https://www.socrative.com/ [accessed: 11 January 2022].
[4]. Moodle (2022). Moodle: Community driven, globally supported. Retrieved from: https://moodle.org/ [accessed: 11 January 2022].
[5]. Martin Nunez, J. L., Tovar Caro, E., & Hilera Gonzalez, J. R. (2017). From Higher Education to Open Education: Challenges in the Transformation of an Online Traditional Course. IEEE Transactions on Education, 60(2), 134-142.
[6]. Kangas, M., Koskinen, A., & Krokfors, L. (2017). A qualitative literature review of educational games in the classroom: the teacher’s pedagogical activities. Teachers and Teaching, 23(4), 451-470.
[7]. Martínez-Ortiz, I., Pérez-Colado, I., Rotaru, D. C., Freire, M., & Fernández-Manjón, B. (2019, April). From heterogeneous activities to unified analytics dashboards. In 2019 IEEE Global Engineering Education Conference (EDUCON) (pp. 1108-1113). IEEE.
[8]. Yassine, S., Kadry, S., & Sicilia, M. A. (2016, April). A framework for learning analytics in Moodle for assessing course outcomes. In 2016 IEEE Global Engineering Education Conference (EDUCON) (pp. 261-266). IEEE.
[9]. Brooks, D. C., & Pomerantz, J. (2017). ECAR Study of Undergraduate Students and Information Technology, 2017. EDUCAUSE.
[10]. Ifenthaler, D., & Yau, J. Y. K. (2020). Utilising learning analytics to support study success in higher education: a systematic review. Educational Technology Research and Development, 68(4), 1961-1990.
[11]. Zytkow, J. M., & Klösgen, W. (2002). Multidisciplinary contributions to knowledge discovery. In Handbook of data mining and knowledge discovery (pp. 22-32).
[12]. Clow, D. (2013). An overview of learning analytics. Teaching in Higher Education, 18(6), 683-695.
[13]. Li, L. Y., & Tsai, C. C. (2017). Accessing online learning material: Quantitative behavior patterns and their effects on motivation and learning performance. Computers & Education, 114, 286-297.
[14]. Alonso-Fernández, C., Cano, A. R., Calvo-Morata, A., Freire, M., Martínez-Ortiz, I., & Fernández-Manjón, B. (2019). Lessons learned applying learning analytics to assess serious games. Computers in Human Behavior, 99, 301-309.
[15]. Cerezo, R., Sánchez-Santillán, M., Paule-Ruiz, M. P., & Núñez, J. C. (2016). Students’ LMS interaction patterns and their relationship with achievement: A case study in higher education. Computers & Education, 96, 42-54.
[16]. You, J. W. (2016). Identifying significant indicators using LMS data to predict course achievement in online learning. The Internet and Higher Education, 29, 23-30.
[17]. Çebi, A., & Güyer, T. (2020). Students’ interaction patterns in different online learning activities and their relationship with motivation, self-regulated learning strategy and learning performance. Education and Information Technologies, 25(5), 3975-3993.
[18]. Dutt, A., Ismail, M. A., & Herawan, T. (2017). A systematic review on educational data mining. IEEE Access, 5, 15991-16005.
[19]. Hellas, A., Ihantola, P., Petersen, A., Ajanovski, V. V., Gutica, M., Hynninen, T., ... & Liao, S. N. (2018, July). Predicting academic performance: a systematic literature review. In Proceedings companion of the 23rd annual ACM conference on innovation and technology in computer science education (pp. 175-199).
[20]. Antonaci, A., Klemke, R., & Specht, M. (2019, September). The effects of gamification in online learning environments: A systematic literature review. In Informatics (Vol. 6, No. 3, p. 32). Multidisciplinary Digital Publishing Institute.
[21]. Yang, Y., Hooshyar, D., Pedaste, M., Wang, M., Huang, Y. M., & Lim, H. (2020). Prediction of students’ procrastination behaviour through their submission behavioural pattern in online learning. Journal of Ambient Intelligence and Humanized Computing, 1-18.
[22]. Baker, R. S., Lindrum, D., Lindrum, M. J., & Perkowski, D. (2015). Analyzing Early At-Risk Factors in Higher Education E-Learning Courses. International Educational Data Mining Society.
[23]. Martinez Abad, F., & Chaparro Caso López, A. A. (2017). Data-mining techniques in detecting factors linked to academic achievement. School Effectiveness and School Improvement, 28(1), 39-55.
[24]. Recker, M., & Lee, J. E. (2016). Analyzing learner and instructor interactions within learning management systems: Approaches and examples. Learning, Design, and Technology, 1-23.
[25]. Romero, C., Cerezo, R., Bogarín, A., & Sánchez-Santillán, M. (2016). Educational process mining: A tutorial and case study using Moodle data sets. Data mining and learning analytics: Applications in educational research, 1.
[26]. Univ. Castilla-La Mancha (2022). Campus virtual University of Castilla-La Mancha. Retrieved from: https://www.uclm.es/areas/areatic/servicios/docencia/campusvirtual [accessed: 15 January 2022].
[27]. WEKA (2022). Weka: The workbench for machine learning. Retrieved from: https://www.cs.waikato.ac.nz/ml/weka/ [accessed: 16 January 2022].
