TEM Journal. Volume 11, Issue 2, pages 497-505, ISSN 2217-8309, DOI: 10.18421/TEM112-01, May 2022.
Abstract – Online teaching activities based on increasingly used computer-based educational systems lack standard rules for their implementation. This paper describes the design of online training activities using Moodle as a Learning Management System (LMS) and evaluates students’ short-term and long-term learning outcomes by applying data mining techniques. Clustering and classification algorithms are combined to uncover valuable, non-obvious student patterns from a well-defined collection of data. Results from online quiz-based activities in a Computer Science subject show that students who are not engaged in the training activity during the short-term learning process fail. Data analysis also shows that the number of trials is a key attribute. Hence, it is important to develop user-friendly online activities with real-time feedback based on student behaviour. Moreover, according to our experiment, online training activities decrease in efficiency over time.

Keywords – higher education, Learning Management System, Data Mining, students’ behaviour, students’ outcomes.

DOI: 10.18421/TEM112-01
https://doi.org/10.18421/TEM112-01
Corresponding author: Carmen Carrión, School of Computer Engineering, University of Castilla‐La Mancha, 02071‐Albacete, SPAIN.
Email: carmen.carrion@uclm.es
Received: 27 January 2022.
Revised: 15 March 2022.
Accepted: 23 March 2022.
Published: 27 May 2022.
© 2022 Carmen Carrión; published by UIKTEN. This work is licensed under the Creative Commons Attribution‐NonCommercial‐NoDerivs 4.0 License.
The article is published with Open Access at https://www.temjournal.com/

1. Introduction

The COVID-19 pandemic stresses the need for innovative learning technologies based on online activities [1]. The spread of COVID-19 and its variants means that many educational centres have to switch to an educational model in which part, if not all, of their teaching is online. Thus, the teacher incorporates open educational resources or activities based on Learning Management Systems (LMS) to complement face-to-face classes [2], [3], [4]. Massive Open Online Courses (MOOCs) even appear as an extreme expression in which all teaching is online [5]. However, no standard guidelines exist on how to implement an online activity in a course or grade; the decision is always left to the best judgment of the teacher in charge [6], [7].

Computer-based educational systems are a fundamental pillar of any online educational activity. Hence, LMSs such as Moodle, Edmodo or Canvas are becoming widely used in higher education [4]. Nowadays, LMSs are mainly used as a static information exchange tool [8]. On the one hand, the teacher uploads information about the syllabus and the slides of the course and, on the other hand, the students upload the results of the laboratory practices and the exercises set as homework [9]. However, LMSs often have configurable functional features, such as quizzes, puzzles or blogs, that allow dynamic interaction and student collaboration to be included in the online teaching-learning process. The primary purpose of these functional features is to increase students’ motivation, engagement and academic results. Notice that, in online learning environments, measuring the efficiency of the developed activities and adapting the teaching process according to the feedback received from the students are not as straightforward as in traditional face-to-face classes. Efficiency analysis of online activities is only possible if sufficient information about the individual characteristics of the students is obtained through their interactions with the environment. Researchers usually apply different methodologies to the educational context in order to discover hidden knowledge and patterns in the learning process [10]. Among them, Data Mining (DM) techniques have become increasingly prominent for improving the teaching-learning process. They make use of mathematical algorithms to extract non-obvious behavioural patterns of interest from a large data set [11].
In this paper, we emphasize understanding the education process in its full complexity, leveraging teacher judgment, and applying DM techniques to gain insight into the students’ learning process [12], [13], [14]. This task is not easy, and the studies carried out so far report contrasting relationships between students’ online behaviour and their predicted performance [13], [16], [17].

To tackle this issue, more research is required, and this paper takes a step forward by establishing a set of objectively verifiable indicators for online teaching-learning activities and by examining measures related to short-term and long-term knowledge. Hence, the major aim of this paper is to extract valuable information about the achievement of online activities in the short term and the long term in order to guide teachers in their development and to make a continuous improvement in learning outcomes.

The data-driven analysis collects student behaviour during the completion of online activities via the LMS and makes use of DM clustering and classification techniques. The paper presents a use case for Computer Engineering, but the study can be applied to different educational environments, courses and subjects. Specifically, the aim of this paper is to answer questions like these:

• Is there a relationship between students’ online activity behaviours and their short-term and long-term outcomes?
• Can relevant information be extracted from the online activity interaction data to understand its impact, guide the teacher, and eliminate activity-related learning deficiencies, if any?

Education is a living process, and the proposed data-driven analysis of the online activities will support the teacher with the feedback needed to succeed in the teaching-learning process. To sum up, the key contributions of this paper are:

• The proposal of a generic and reproducible online teaching-learning activity based on open-source tools.
• The definition of a set of quantitative parameters useful for the data analysis of online activities.
• The use of DM as a tool to discover significant patterns that model students’ trends in both the short-term and the long-term learning process.
• The analysis and evaluation of a practical use case in computer science.

The paper is organized as follows. In Section 2, we present the results obtained and the techniques employed by the related work. Details of our life cycle for designing and analysing online activities are discussed in Section 3. Then, Sections 4 and 5 detail the data parameters and the data mining analysis applied in our approach, respectively. After that, a use case in Computer Science is presented in Section 6, followed by the data analysis results in Section 6-a. Finally, Section 7 presents the conclusions and the future work.

2. Related Work

Different DM techniques for finding hidden knowledge and patterns in the learning process are applied to educational environments [10]. Several systematic literature reviews have been conducted in education to describe the state of the art in this area [18], [19], [20]. Clustering and classification DM techniques provide good results in predicting students’ results [21].

In [18], the authors present a systematic literature review covering over three decades of work on clustering algorithms and their applicability in Educational Data Mining (EDM). The paper outlines that educational data are non-independent in nature and that clustering can provide a relatively unambiguous outline of student learning style as a function of several variables. In [19], the authors show that most of the research work in the area of predicting students’ performance focuses on predicting attainable academic metrics such as exam grade, course grade, program retention or dropout, or assignment performance [22], [23]. For example, in [23] the authors use decision trees to apply classification techniques and detect factors linked to academic performance in large-scale assessments. Their results indicate that personal factors are the most indicative of academic performance, followed by school-related and social factors. Moreover, according to [19], data on students’ activity is one of the least explored sources for predicting students’ performance, which is the focus of this paper.

The effect of students’ online interaction with Moodle and its relationship with achievement is examined in [15]. The variables considered in [15] are time task, time theory, time forums, word forums, relevant actions and procrastination. Additionally, final marks were extracted from the performance in the subject. Using k-means clustering techniques, they found that more activity in the LMS does not assure better results. Moreover, they found that the students who handed in the task later were more likely to receive a lower score, that is, the more procrastination, the worse the performance.

In [16], significant indicators from the LMS data, such as regular study, total viewing time, sessions, late submissions, proof of reading the course information packets, and messages created, were chosen as predictors. The results revealed that regular study was the strongest predictor of course achievement, followed by late submissions, sessions, and proof of reading the course information packets. However, students’ total viewing time and messages created were not significant.
In [13], the authors use the time spent accessing online learning materials (video, slides, etc.) to classify students into different clusters. They found that the behaviours of accessing online learning materials were associated with learning performance. More precisely, the students who invested more time and effort in viewing the online learning materials had better learning performance.

In [17], the authors use the k-means clustering algorithm to examine whether students’ interaction with different online learning activities affects their learning performance, motivation, and self-regulated learning strategies. Their results conclude that the students who spent more time on the learning activities obtained higher academic outcomes.

Nevertheless, at this point we have to highlight that the conclusions drawn in [15], [16] contradict the abovementioned results in [13], [17]. These discrepancies may arise because the analyses were carried out under different contexts, types of courses and student backgrounds [24], [17]. In any case, they underline the need for further studies to establish some baselines and avoid the contradictions, as well as to set up a set of objectively verifiable indicators for the teaching-learning process. For this reason, this paper presents the results obtained for a use case using a generic methodology based on the analysis of online activity data for the short term and the long term.

3. Development and Evaluation of Short-term and Long-term Online Activities

The methodology applied in this paper to model and analyze the effectiveness of online activities using the Moodle LMS includes a complete life cycle that combines two main roles: the in-class and the off-stage. Figure 1 details the steps involved in the development of the online activities. Next, we explain all the involved steps in detail.

Figure 1. Proposed methodology to develop online activities

The in-class role comprises those stages in which the student participates actively even though they are conducted online. More precisely, the in-class role includes the training, the validation, and the final step.

• Training and Practice step: Online training activities are web-based activities accessible via the Internet that are available 24/7 up to a prefixed day on the digital platform. So, students can access the online training activity whenever they want, use the time they need and repeat the online training activity as many times as they wish. Unlike face-to-face training activities, where the teacher provides the appropriate coaching and supervision while the activity is running in the classroom, online activities lack the direct support of the teacher. Hence, it is essential to give specific and easy-to-understand instructions on how to carry out the activity.
• Validation step: After the training step the students should have acquired new skills and knowledge. Therefore, the validation step allows them to check their improvement. At this point, the student receives the first quantifiable performance indicators about his or her progress. This step can be an important turning point for students, being a very stimulating phase in the learning process or just the opposite. The parameters collected in this step are related to short-term learning. In this paper, this term is defined as the learning process over the course of a single session or during the training period of a single online activity.
• Final step: This is the last step of the in-class role and provides information about the long-term learning process. Data collection includes both cognitive and affective outcomes. The students complete theoretical questionnaires and practical exercises at the end of the course to measure the cognitive outcome, and post-test questionnaires to measure the affective outcomes.

The off-stage role is characterized by the fact that the student does not actively participate in it, although it is crucial for the achievement of the objectives for which the online activity is carried out. The off-stage role includes the design and implementation, the analysis and the feedback step.

• Design and implementation step: Learning activities are created and designed to achieve learning objectives. So, in this step the teacher addresses the challenges of analyzing the capacity and feasibility of the online activities to work on the learning outcomes: the complexity, the context and the timeline of the activity have to be taken into account. It is compulsory to examine carefully the students’ previous understanding of the topic, as well as the complexity of the activity, to reduce the negative outcome of poor performance. Then, well-structured support has to be developed before starting the learning activity. Both the required theoretical knowledge and a detailed description of how to do the activity have to be provided to the student in order to successfully achieve the goals of the activity. Moreover, an updated scheduling timetable has to be
presented to the students so that they know in which sessions the activity will be running, together with the important deadlines. For example, if the activity requires the production of a deliverable, the deliverable always has to be uploaded to the LMS on time. Setting deadlines for activities in advance is especially important if they involve extra work outside of class time. To sum up, the design and implementation step is a key phase in the off-stage role that includes:
  - The teacher’s training period, in which the teacher searches for new teaching approaches and learns the details of using new digital tools.
  - The design period, in which the teacher plans the activities considering the learning objectives and the time available for the activity. The teacher also has to design the observable parameters for further analysis. A detailed description is provided in Section 4.
  - Finally, the get-ready step, in which the teacher prepares all the materials. This step may involve, for example, setting up access to a digital platform, preparing some hints and prizes and/or designing a multiple-choice test.
• Analysis step: Both a statistical exploration of data and DM techniques can be used to discover useful information which will allow us to improve the scheduling and the design of the activity. The first steps of the DM analysis include data collection and pre-processing. In some cases, collecting data about students requires knowledge of the applicable personal privacy laws and regulations. In the proposed data analysis, just after the pre-processing, a clustering technique is combined with classification techniques to observe students’ patterns and accurately predict important features. In short, data is transformed into useful information for decision-making. More details of the proposed data analysis are provided in Section 5.
• Feedback step: Based on the reports obtained in the previous step, we detect and even anticipate possible drifts in the teaching-learning process. It is about predicting deviations and non-effective performance in the classroom, as well as decreasing the dropout rate. The effectiveness of the activity is improved by applying successive refinements that consider the feedback obtained from the data analysis. The key idea is to adapt the teaching-learning activity by solving the detected problems and reinforcing the mechanisms that do work. It is important to note that the feedback step may not be the last one of the season. Data analysis can trigger a reactive mechanism on a continuous basis while the activity is in progress. For example, it would be possible to collect and analyze information during the training step.

4. Collecting Parameters for Data Mining

Improving the efficiency of teaching activities requires the definition of suitable metrics. It is necessary to collect some quantifiable parameters that measure both the students’ behavior and their performance, and that denote the level of success. Examples of these parameters may be the percentage of tasks completed or the score obtained in a test. Specifically, the parameters employed in this research have been classified according to their target end-goal as cognitive and behavioral. Cognitive parameters are directly related to academic scores or grades, and behavioral parameters are related to the student’s commitment and attitude during the course. Among the parameters used to analyze the student behavior we have collected:

• The Trial (T) is measured as the number of times a student completes an activity. Note that some online quizzes can be done as many times as students need, but some just-in-time teaching activities can be done only once.
• The Reinforcement (Re) computes the number of trials done by the student after having obtained the maximum score in the activity.
• The Readiness (R) is a measure of the student’s willingness to do the task. More precisely, in this work readiness has been calculated as how far in advance of the delivery time the student takes the task.
• The learning Velocity (V) is measured as the number of trials done by the student to obtain the maximum mark. The parameter reflects in some way the learning capacity of the student. It can be a useful metric to give personal assistance to the student.
• The Cost (C) is measured as the time employed by each student to complete the learning activity.
• The Participation (P) is a global metric that measures the total number of students that participate in a teaching-learning activity.

To measure the cognitive outcomes, we have collected:

• The Initial Score (IS) provides a quantitative measure of the student’s initial level of knowledge on the topic. A high score denotes a high probability that the student already had the targeted skills before starting the activity, while a low value indicates the opposite.
• The Recent Score (RS) measures the short-term knowledge acquired by the student. RS takes into account what students know at the end of a training and practice step without considering the mistakes made during that learning process. This metric should be collected as soon as the practice step concludes.
• The Final Score (FS) computes the long-term knowledge. The key idea is to tackle the final activity, probably at the end of the course, to measure
the knowledge acquired by the students thanks to the developed activity.
• The Effectiveness (E) calculates the number of students that did not possess the targeted knowledge before the activity but had acquired it successfully by the end.

5. DM in the Off-stage Role

In this section, we outline the steps to be carried out to extract value from the data obtained from the students' interaction with the Moodle activities. We want to verify whether the online activities have been designed successfully and therefore have boosted the teaching-learning process. In general, the learning data mining process consists of the following steps: collection, pre-processing, mining, and evaluation. Figure 2 outlines the main steps of our approach.

are also removed, and all data are filtered to consider only those online activities that have been successfully completed. In addition, at this stage, each student’s first and last name is converted into a numeric fingerprint that uniquely and anonymously identifies the student.

After that, using a computer spreadsheet application, simple calculations were performed to obtain the quantitative values of the metrics defined in Section 4. In a course, as many short-term online activities (Ai) can be performed as there are learning objectives defined (N). Therefore, for each metric defined in Section 4 we obtain a vector of N values. In our analysis, we have computed a global value for each parameter to avoid non-significant fluctuations. In general, the global value of a parameter (labeled as K) is computed as a weighted (Wi) average of each individual activity (Ai) according to Eq. 1:
(1)
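As an illustration of the pre-processing and metric computation described above, the following minimal Python sketch derives a pseudonymous student identifier, the Trial (T), Reinforcement (Re) and Velocity (V) values of one activity, and a weighted global value K over N activities. It is a sketch under stated assumptions, not the spreadsheet procedure used in the study: the function names, the salted SHA-256 fingerprint, the convention that V counts the trial on which the maximum mark is first reached, and the normalisation of the weighted average by the sum of the weights are all assumptions introduced here.

```python
import hashlib
from typing import Optional, Sequence, Tuple


def fingerprint(first_name: str, last_name: str, salt: str = "example-salt") -> int:
    """Hypothetical pseudonymisation: map a student's name to a stable number.

    A truncated hash makes collisions very unlikely, not impossible.
    """
    key = f"{salt}|{first_name.strip().lower()}|{last_name.strip().lower()}"
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest()[:12], 16)


def trial_metrics(
    scores: Sequence[float], max_score: float
) -> Tuple[int, int, Optional[int]]:
    """Return (T, Re, V) from one student's ordered attempt scores on one activity.

    T  = number of completed trials.
    V  = trials needed to first reach the maximum mark (None if never reached).
    Re = trials done after the maximum mark was first reached.
    """
    trials = len(scores)
    first_max = next((i for i, s in enumerate(scores) if s >= max_score), None)
    if first_max is None:
        return trials, 0, None
    velocity = first_max + 1
    return trials, trials - velocity, velocity


def global_value(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of per-activity values, normalised by the weight sum (assumed form)."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and of equal length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * v for w, v in zip(weights, values)) / total_weight


if __name__ == "__main__":
    # Made-up attempt scores for one student on one quiz graded out of 10.
    print(trial_metrics([6, 8, 10, 10, 9], max_score=10))
    # Made-up per-activity T values for N = 3 activities, with made-up weights.
    print(global_value([2.0, 4.0, 6.0], [1.0, 1.0, 2.0]))
```

If the weights already sum to one, the normalisation leaves the result unchanged, so the sketch covers both conventions.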
Then, for the purpose of observing students' patterns, we apply some classification techniques. The idea is to use the criterion variable (RS or FS) as a reference for the algorithm to split the input data set into mutually exclusive subgroups. The aim of using the classification algorithm is to model and predict IR and IF for each case of the input dataset. Note that RS is the criterion variable for the short-term teaching-learning process while FS is used for the long-term process. The better the students are ranked on the criterion variable, the higher the validity of the model. Individual well-known classification techniques, such as C4.5, LMT, RandomTree or HoeffdingTree, should be explored to increase the final accuracy and precision of the system.

work has been called the long-term results of the online activities. All these students’ interactions with the online activities are tracked by Moodle and allow us to extract value from the data through data analytics, as shown in the next section.

Table 1. Centroids of each Cluster (mean, std. dev.)

a. Data analysis results
Figure 3-b. Cognitive distribution of students according to the EM algorithm (RF distribution)

Figure 3-b shows the FS distribution, which is deeply linked to the long-term teaching-learning process. In this case, the high percentage of students who do not pass the test stands out. So, the results show that, over time, learners have forgotten what they learnt in the online training phase. However, students in Cluster2 are less likely to fail. At this point it should be mentioned that the complexity of the final test is greater than that of the tests taken in the short-term learning process, as it covers the content of all the topics of the subject. Although this aspect may affect the results obtained, the impact of online learning activities on long-term academic results is limited.

To continue the search for valuable information, data are analyzed by applying decision trees. This analysis aims to identify the behavioral parameters that have the greatest impact on academic outcomes, so that inappropriate behavior can be detected and corrective measures applied in future implementations to improve the quality of the online activities. This process comprises two parts: analyzing the short-term teaching-learning process, which takes into account RS, and the long-term learning process, which is directly related to the FS attribute. Four datasets are evaluated: the three datasets obtained from the EM algorithm, called Cluster0, Cluster1 and Cluster2

Figure 4 shows the decision trees obtained for ClusterAll and Cluster0. The results reflect that R, T and their standard deviation are the key parameters. A high value of R is associated with an Excellent outcome (see Figure 4-a). An important finding is that T is a key attribute. All the Fail outcomes are associated with students who were not involved in the training step. Another finding is that Re does not appear in the classification algorithm for any dataset. Moreover, C is not a decisive attribute and, counter-intuitively, the classification algorithm shows that students with a lower cost value obtain better outcomes.

Figure 4-a. Classification of students for the short-term learning process using the J48 algorithm (ClusterAll)
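J48 is the Weka implementation of C4.5, which selects splits using the gain ratio. To make concrete how an attribute such as T can end up near the root of such a tree, the sketch below computes the gain ratio of a binary threshold split on a numeric attribute. It is a simplified illustration, not the procedure run in the study: the function names and the toy data are invented here, and a real C4.5 or J48 tree adds recursion, pruning and further corrections that are omitted.

```python
from collections import Counter
from math import log2
from typing import Optional, Sequence


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gain_ratio(values: Sequence[float], labels: Sequence[str], threshold: float) -> float:
    """Gain ratio of splitting the cases into value <= threshold and value > threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    n = len(labels)
    p_left, p_right = len(left) / n, len(right) / n
    gain = entropy(labels) - p_left * entropy(left) - p_right * entropy(right)
    split_info = -(p_left * log2(p_left) + p_right * log2(p_right))
    return gain / split_info


def best_threshold(values: Sequence[float], labels: Sequence[str]) -> Optional[float]:
    """Candidate threshold with the highest gain ratio, or None if no split is possible."""
    candidates = sorted(set(values))[:-1]  # the largest value cannot split the data
    if not candidates:
        return None
    return max(candidates, key=lambda t: gain_ratio(values, labels, t))


if __name__ == "__main__":
    # Toy data (not from the study): number of trials T and outcome per student.
    trials = [0, 0, 1, 3, 4, 5]
    outcome = ["Fail", "Fail", "Pass", "Pass", "Pass", "Pass"]
    t = best_threshold(trials, outcome)
    print(t, gain_ratio(trials, outcome, t))
```

In the toy data every student with zero trials fails, so the split at T <= 0 separates the classes perfectly and obtains the highest gain ratio.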
Table 3. Output performance of the J48 algorithm for RF as classification attribute

7. Conclusions and Future Work

Tracking and analyzing students’ activity is critical
in online activities. In this work, behavioral and
cognitive parameters of online activities have been
defined, and techniques based on DM have
uncovered significant patterns in students’ behavior.
These patterns are key to recognizing trends among
students’ work and improving their learning
outcomes. The data analysis shows that integrating
online training activities into the learning process
improves students’ results, especially in the short
term. Moreover, the parameter that most influences
the cognitive outcomes is the number of times a
training activity is completed. It is important to
highlight that the methodology has been described
in detail, and open-source tools have been used to
simplify the comparison and replication of similar
experiences for future improvements. The datasets
extracted from our online activities using Moodle
quizzes are available to the scientific community.
We believe that online training activities could be
improved by integrating a recommendation
mechanism generated from the behavioral metrics of
each student and the results obtained in this work.
Therefore, we would like to extend the online
activities with a real-time notification system that
guides students in the in-class role.