D7.4 - Validation 4
Appendix B: Validation Reporting Templates
Table of Contents
Appendix B.1 Validation Reporting Template Overview
Appendix B.2 Validation Reporting Template for WP4.1 (bitmedia & IPP-BAS)
Appendix B.3 Validation Reporting Template for WP4.2 (UNIMAN and OUNL)
Appendix B.4 Validation Reporting Template for WP5.1 (PUB-NCIT and UNIMAN)
Appendix B.5 Validation Reporting Template for WP5.2 (UPMF / CNED)
Appendix B.6 Validation Reporting Template for WP6.1 (IPP-BAS & Sofia University)
Appendix B.6 Validation Reporting Template for WP6.2 (PUB-NCIT & UU)
Appendix B.7 Validation Reporting Template for Long Thread (OUNL, AURUS & PUB-NCIT)
This appendix provides the full validation reporting templates (VRTs), with the exception of the pilot sites, courses and participants
(part of Section 2) which is provided in Appendix A.3. The appendix begins with an overview describing the layout of the VRTs.
Appendix B.1 Validation Reporting Template Overview
The finalized validation template for Round 3 comprises the following sections:
Section 1: Functionality implemented in Version 1.5 or the Long Thread, alpha and beta testing
Section 2: Validation pilot overview
Section 3: Results – validation/verification of Validation Topics
Section 4: Results – validation activities informing future changes / enhancements to the system
Section 5: Results – validation activities informing transferability, exploitation and barriers to adoption
Section 6: Conclusions
Section 7: Roadmap (to pass to D2.5)
Section 1 describes the changes to the software in Version 1.5, compared with Version 1.
Section 2 describes the pilot environment: participants, language, pilot task to be completed, summary details of verification
and other experiments.
Sections 3 – 5 provide the results. Partners were asked to provide the results scientifically, without discussion ("the results
should speak for themselves"), to allow the WP7 team and others to take as unbiased a view as possible.
Section 6 provides the conclusions under four headings:
Conclusions on whether the validation topics have been validated, within the limitations of the methodologies used
A SWOT analysis (Strengths, Weaknesses, Opportunities and Threats) considering the objective "<LTfLL service>
(v1.5) will be adopted in pedagogic contexts beyond the end of the project".
A short overall conclusion regarding the likelihood of adoption of v.1.5 of the service (as informed by the SWOT).
The most important future actions to promote adoption of the service (as informed by the SWOT). These should be
carried into the Roadmap section 7.
Section 7 provides the roadmap to be passed to D2.5. The roadmap is in five sections, addressing:
Future enhancements to the system
Changes to scenarios of use
Possible additional educational contexts for deployment
The most important issues for future technical research to enable deployment of language technologies in educational
contexts
Further validation planned for beyond the end of the project
This deliverable seeks to answer specific questions concerning exploitation and the roadmap. This led to a decision to limit
partners' discussions of items in the VRTs, though the WP7 team recognizes the wealth of data that could be discussed in
more depth in future papers. Accordingly, partners were asked to be as brief as possible and to draw very specific
conclusions concerning exploitation and the roadmap (only) from their data.
Because Section 3 (Results for validation topics) provides data to be categorised as validated / validated with qualifications /
not validated, it was particularly important that categorisation teams should not be influenced by discussion in this section.
Appendix B.2 Validation Reporting Template for WP4.1 (bitmedia & IPP-BAS)
Verification data was provided by WUW.
Section 1: Functionality implemented in Version 1.5 and alpha / beta-testing record
Brief description of functionality | Version number of unit | Changes from Version 1.0
Short thread (4.1 - 6.1) | v1.5 | The domain ontology in IT, present in 6.1, was also made available to stakeholders of 4.1. It gives the concept coverage within learners' answers in the live feedback (matched, missing, and additional) through concept annotation and comparison. The tutor can also mark the information he/she agrees with to improve the results of the service. Semantic search can be used during the creation of the questionnaire to select appropriate learning objects. Manual addition of concept annotation can be used to enrich the lexicalization of the ontology. Lexicons and annotation grammars for German and Bulgarian have been added and related to the ontology in the appropriate way.
Annotation service | v1.5 | Provided concept annotation for learning materials in Bulgarian in addition to English; enriched the Information Technologies ontology and related lexicons for the two languages. As a consequence of the improved concept availability, tutors are given a better choice when selecting representative conceptual information on a given topic, and learners are provided with a more exhaustive model of the particular knowledge domain (vocabulary and notions).
Live feedback component | v1.5 | Integration of knowledge rich (KR) and knowledge poor (KP) approach for different languages: the integration of the KR and KP approaches has been enhanced for use in the German pilot. The combination of the two results adds one more layer of knowledge representation for both the learning materials and the learners' answers. The former supplies the concept information, while the latter supplies the language expressions in texts. Tutors can prepare for each question a fixed list of concepts that they consider obligatory for an answer to be satisfactory. Learners can use this information, which is part of the Live Feedback, to learn new concepts, to find information about them and to improve their answers.
Lexicalisations update | v1.5 | Added functionality that lets the tutor add new language expressions for the domain-specific concepts and thus affect text annotation in both learning materials and learners' answers. According to his/her understanding, s/he can supplement the list of lexical items corresponding to a given concept with new terms or terms that for some reason were not included in the original lexicon.
Alpha-testing
Pilot site and language
Bitmedia (German)
Date of completion of alpha testing:
28 October 2010
Who performed the alpha testing?
bitmedia (Christoph Mauerhofer, Wolfgang Maierl)
Pilot site and language
IPP-BAS (Bulgarian)
Date of completion of alpha testing:
12 October 2010
Who performed the alpha testing?
Alexander Simov, IPP-BAS
Beta-testing
Pilot site and language: bitmedia (Wien)
Has stakeholder validation taken place using the service embedded in Elgg (Yes/No/Partially): No
If ‘No’ or ‘Partially’, give reasons:
The primary aim of the beta-testing was to ensure the quality of the service results, which do not depend on
whether the embedded or the standalone version is used. During the validation we used the standalone
version to avoid differences in user interface layout caused by the customizations available in the widget
version.
The widgetised version was used for presentations and dissemination activities and will be the only
version used for further activities.
beta-testing performed by:
Barbara Busch, Hans Kudy (bitmedia Wien)
beta testing environment (stand-alone service / integrated into Elgg): stand-alone service
HANDOVER DATE:
17.11.2010
(Date of handover of software v.1.5 for validation)
Pilot site and language: IPP-BAS (Bulgarian)
Has stakeholder validation taken place using the service embedded in Elgg (Yes/No/Partially): Yes
If ‘No’ or ‘Partially’, give reasons:
beta-testing performed by:
Laska Laskova, Stanislava Kancheva (IPP-BAS)
beta testing environment (stand-alone service / integrated into Elgg): Elgg
HANDOVER DATE:
17.11.2010
(Date of handover of software v.1.5 for validation)
Section 2: Validation Pilot Overview
NB Information about pilot sites, courses and participants has been transferred to Appendix A.3
Pilot task
Pilot site:
bitmedia Austria
Pilot language: German
What is the pilot task for learners and how do they interact with the system?
Learners answer a set of questions on topics from the course domain (Introduction to IT). They receive instant live feedback from the system, so
they do not have to wait for the tutors' reaction to get recommendations for improving their knowledge, if needed.
The tutors are advised to use the short thread scenario to add annotations to the concept.
What do the learners produce as outputs? Are the outputs marked?
The output is a short text answering an open question. The output is graded from 0 to 100 (by the system, based on the tutors' existing grading of
different answers).
How long does the pilot task last, from the learners starting the task to their final involvement with the software?
During the first part of the learners' educational path, the LTfLL project team explained to the tutors and learners involved the goals of the LeaPos
Service, the basic usage of the user interface and the difference between phrases and concepts.
Further use of the service was handed over to the individual tutors and learners. Depending on their individual time management, the learners
could use the LeaPos Service immediately after the introduction or at a time of their choosing during the following days.
Each learner used the LeaPos Service once or twice to get feedback from the system.
About one and a half to two weeks later, a final session took place to collect feedback and conduct the individual interviews.
During the two-week period the learners followed their defined learning path and were allowed to use different tools to reach their
learning goals (e-Learning content, lab materials, LeaPos Service, printed learning materials, tutors' help, ECDL pretesting system, …).
The tutors were able to use the short thread scenario to add annotations to the concept.
How do tutors/student facilitators interact with the learners and the system?
Tutors select questions, relevant concepts, phrases and learning materials for these questions. In addition, the short thread functionality for the
annotation approach is used. They assess and comment on learners' answers after considering live feedback information.
Describe any manual intervention of the LTfLL team in the pilot:
No manual intervention.
Pilot site:
Sofia University
Pilot language: Bulgarian
What is the pilot task for learners and how do they interact with the system?
Learners answer a set of questions on random topics from the course domain (Introduction to IT). They receive instant live feedback from the
system, so they do not have to wait for the tutors' reaction to get recommendations for improving their knowledge, if needed.
What do the learners produce as outputs? Are the outputs marked?
The output is a short text answering an open question. The output is graded from 0 to 100 (by the system, based on the tutors' existing grading of
different answers).
How long does the pilot task last, from the learners starting the task to their final involvement with the software?
A two-week time span, from the learners' login to the system (obligatory) through the tutor's final grading of the answers (obligatory) to the
students' improved answers (optional). Since the students said that they would need more flexible times for performing the task, the
time and the number of corrections for the answers were not fixed. It was therefore not surprising that feedback was also received after the
two-week period.
How do tutors/student facilitators interact with the learners and the system?
Tutors select questions, relevant concepts, phrases and learning materials for these questions. They assess and comment on learners' answers after
considering live feedback information.
Describe any manual intervention of the LTfLL team in the pilot:
No manual intervention.
Section 3: Results - validation/verification of Validation Topics
OVT: 1.1
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: Absolute value of score: The tutors/experts find that when the score given by the system is compared with the
score given by the tutor, the difference between the two values is small.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology:
Measure: Pearson correlation of system scores against human/tutor scores. Based on 303 graded answers and 10 questions.
Scoring algorithm: Weighted scoring. Take the 3 closest answers and calculate the average grade, weighted by cosine distance. For testing purposes,
the answer being scored is excluded (N-1) when searching for the closest answers.
Results:
Correlation*: 0.57
Training process: Automated best dimension identification: Given the ranking formula, calculate the ideal number of space dimensions to be used
that achieves the highest possible correlations on a per-question basis.
* Correlations range from 0.25 to 0.68 depending on the question.
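The scoring and evaluation procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the LeaPos implementation: the function names are hypothetical, answers are assumed to be available as plain numeric vectors, and cosine similarity is used as the weight so that closer answers count more (the source only says "weighted by cosine distance").

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def predict_grade(target, graded, k=3):
    """Similarity-weighted average grade of the k graded answers closest to target.

    graded is a list of (vector, tutor_grade) pairs.
    """
    scored = [(cosine_similarity(target, vec), grade) for vec, grade in graded]
    nearest = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    # Assumption: a negative similarity contributes zero weight.
    weights = [max(sim, 0.0) for sim, _ in nearest]
    total = sum(weights)
    if total == 0.0:
        # No usable weights: fall back to the unweighted mean of the neighbours.
        return sum(grade for _, grade in nearest) / len(nearest)
    return sum(w * grade for w, (_, grade) in zip(weights, nearest)) / total


def leave_one_out_grades(graded, k=3):
    """Score every answer against all the others (the N-1 test setting)."""
    return [
        predict_grade(vec, graded[:i] + graded[i + 1:], k)
        for i, (vec, _) in enumerate(graded)
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

With a list of graded answers, `pearson(leave_one_out_grades(graded), [g for _, g in graded])` would give a correlation of the kind reported here, computed either per question or over all answers.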
Formative results with respect to validation indicator, including quotations
Stakeholder type | Results
Tutors / Interview | The absolute value of the score provides useful information about the learner's knowledge.
Tutors / Interview | Deviations in the scoring provided by the LeaPos Service are lower compared with the deviations generated by traditional scoring based on human estimation.
OVT: 1.1
Pilot site: IPP-BAS
Pilot language: Bulgarian
Operational Validation Topic: Absolute value of score: The tutors/experts find that when the score given by the system is compared with the
score given by the tutor, the difference between the two values is small.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology:
Measure: Pearson correlation of system scores against human/tutor scores. Based on 274 graded answers and 10 questions.
Scoring algorithm: Weighted scoring. Take the 3 closest answers and calculate the average grade, weighted by cosine distance. For testing purposes,
the answer being scored is excluded (N-1) when searching for the closest answers.
Results:
Correlation*: 0.63
Training process: Automated best dimension identification: Given the ranking formula, calculate the ideal number of space dimensions to be used
that achieves the highest possible correlations on a per-question basis.
* Correlations range from 0.26 to 0.85 depending on the question.
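The "best dimension identification" step of the training process can be read as a per-question search over candidate dimensionalities. The sketch below assumes that answer vectors have their coordinates ordered by importance, so that keeping the first d coordinates gives the d-dimensional space, and that `correlation_fn` is a hypothetical callable returning the correlation between system and tutor scores for a given set of vectors and grades. Neither assumption is confirmed by the source.

```python
def best_dimension_per_question(answers_by_question, candidate_dims, correlation_fn):
    """Pick, for each question, the dimensionality with the highest correlation.

    answers_by_question maps a question id to (vectors, tutor_grades).
    correlation_fn(vectors, tutor_grades) returns a correlation value.
    """
    best = {}
    for question_id, (vectors, grades) in answers_by_question.items():

        def score(d):
            # Keep only the first d coordinates of every answer vector.
            truncated = [vec[:d] for vec in vectors]
            return correlation_fn(truncated, grades)

        # max() returns the first candidate that reaches the highest score.
        best[question_id] = max(candidate_dims, key=score)
    return best
```

Because the dimensionality is tuned on the same answers that are used to report the correlation, the per-question values obtained this way are best read as an upper bound, which matches the wording "highest possible correlations".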
OVT: 1.2
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: Relative value of score: The tutors/experts find that when a learner has improved his/her answer, as judged by
the tutor, an increase in the live feedback score is observed consistently.
Formative results with respect to validation indicator, including quotations
Stakeholder type | Results
Tutors / Interview | “The improvement of the answers given by the learners is continuously reflected by the feedback score.”
Tutors / Interview | “The increments of the feedback score for improved answers are differing for similar improvements and not absolutely consistent”.
OVT: 1.3
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: Knowledge Poor feedback: The tutors/experts find that a high proportion of the phrases in the two columns
(positive, missing) are judged as being correct feedback.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology:
Measure: Pearson correlation of cumulated phrase scores against human/tutor scores. The qualitative output of the phrase list was quantified by
summing up the underlying phrase scores of the identified phrases in the answer, which are used to determine which phrases are displayed to the
user. Based on 303 graded answers and 10 questions.
Individual phrase scoring formula: log(grade_sum+1) * ridf
Results:
Correlation*: 0.43
* Correlations range from 0.16 to 0.59 depending on the question. Note that this is an artificial value to quantify a qualitative list of detected
phrases and has little if any direct implication on whether the list is useful to the learner/tutor or not.
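The phrase scoring formula and the cumulated phrase score can be illustrated with a short sketch. The source does not define `grade_sum` or `ridf`. The code below assumes that `grade_sum` is the sum of tutor grades of the answers containing the phrase, that `ridf` is residual inverse document frequency (observed IDF minus the IDF expected under a Poisson model), and that the logarithm in the formula is the natural logarithm. All function names are hypothetical.

```python
import math


def residual_idf(doc_freq, coll_freq, n_docs):
    """Residual IDF: observed IDF minus the IDF predicted by a Poisson model.

    doc_freq  -- number of answers containing the phrase
    coll_freq -- total number of occurrences of the phrase in all answers
    n_docs    -- total number of answers
    """
    observed = -math.log2(doc_freq / n_docs)
    expected = -math.log2(1.0 - math.exp(-coll_freq / n_docs))
    return observed - expected


def phrase_score(grade_sum, ridf):
    """Individual phrase score as given in the text: log(grade_sum + 1) * ridf."""
    return math.log(grade_sum + 1) * ridf


def cumulated_phrase_score(answer_phrases, phrase_scores):
    """Sum of the scores of the known phrases identified in one answer."""
    return sum(phrase_scores.get(phrase, 0.0) for phrase in answer_phrases)
```

The cumulated score per answer is the quantity that would then be correlated with the tutor grades; as the note above says, it is an artificial value for quantifying a qualitative phrase list.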
OVT: 1.3
Pilot site: IPP-BAS
Pilot language: Bulgarian
Operational Validation Topic: Knowledge Poor feedback: The tutors/experts find that a high proportion of the phrases in the two columns
(positive, missing) are judged as being correct feedback.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology:
Measure: Pearson correlation of cumulated phrase scores against human/tutor scores. The qualitative output of the phrase list was quantified by
summing up the underlying phrase scores of the identified phrases in the answer, which are used to determine which phrases are displayed to the
user. Based on 274 graded answers and 10 questions.
Individual phrase scoring formula: log(grade_sum+1) * ridf
Results:
Correlation*: 0.5
* Correlations range from 0.28 to 0.74 depending on the question. Note that this is an artificial value to quantify a qualitative list of detected
phrases and has little if any direct implication on whether the list is useful to the learner/tutor or not.
OVT: 1.4
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: Knowledge Rich feedback: The tutors/experts find that a high proportion of the concepts in the two columns
(common, missing, additional) are judged as being correct feedback.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology:
Measure: Pearson correlation of concept scores against human/tutor scores. The qualitative output of the concept list was quantified by summing
up the number of the identified concepts in the answer. Based on 303 graded answers and 10 questions.
Results:
Correlation*: 0.33
* Correlations range from 0.1 to 0.6 depending on the question. Note that this is an artificial value to quantify a qualitative list of detected concepts
and has little if any direct implication on whether the list is useful to the learner/tutor or not.
OVT: 1.4
Pilot site: IPP-BAS
Pilot language: Bulgarian
Operational Validation Topic: Knowledge Rich feedback: The tutors/experts find that a high proportion of the concepts in the two columns
(common, missing, additional) are judged as being correct feedback.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology:
Measure: Pearson correlation of concept scores against human/tutor scores. The qualitative output of the concept list was quantified by summing
up the number of the identified concepts in the answer. Based on 274 graded answers and 10 questions.
Results:
Correlation*: 0.35
* Correlations range from 0.1 to 0.75 depending on the question. Note that this is an artificial value to quantify a qualitative list of detected
concepts and has little if any direct implication on whether the list is useful to the learner/tutor or not.
OVT: 2.1
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: Tutors spend less time preparing final feedback for learners and grading compared with traditional means.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Tutors / Interview
Results:
Average time for the tutor-learner feedback session
(calculated with 10 feedback sessions for “Traditional” and “Using the LeaPos Service”) :
Traditional: about 24 min for each learner
Using the LeaPos Service: about 18 min for each learner
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n=
Tutors | 7. It takes less time to complete my teaching tasks using LeaPos than without the system. | Experimental | 3,4 | 1,18 | 45% | 5
Tutors | 8. Using LeaPos enables me to work more quickly than without the system. | Experimental | 3,3 | 1,14 | 48% | 5
Tutors | 9. I do not wait too long before receiving the requested information. | Experimental | 2,4 | 1,22 | 18% | 5
Tutors | 10. LeaPos provides me with the requested information when I require it (i.e. at the right time in my work activities). | Experimental | 3,3 | 0,99 | 45% | 5
Formative results with respect to validation indicator
Stakeholder type | Results
Tutors | “The service is an additional tool to enable the learners themselves to find out their missing knowledge.”
Tutors | “The response time of the tool is good.”
Tutors | “The overall quality of the feedback is good enough to help the learners and to support new tutors.”
Tutors | “Using the existing feedback of the LeaPos Service the tutor is able to jump into the positioning process at a higher level. So it's possible to ask more specific questions for each learner right from the beginning of the session and save time.”
Comment | Each tutor was responsible for a group of learners, some of whom used the positioning service while the others worked by traditional means. The results of the comparison were used for the time analysis.
OVT: 2.1
Pilot site: IPP-BAS
Pilot language: Bulgarian
Operational Validation Topic: Tutors spend less time preparing final feedback for learners and grading compared with traditional means.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n=
Tutors | 7. It takes less time to complete my teaching tasks using LeaPos than without the system. | Experimental | 4,0 | 1,00 | 67% | 3
Tutors | 8. Using LeaPos enables me to work more quickly than without the system. | Experimental | 4,7 | 0,58 | 100% | 3
Tutors | 9. I do not wait too long before receiving the requested information. | Experimental | 4,7 | 0,58 | 100% | 3
Tutors | 10. LeaPos provides me with the requested information when I require it (i.e. at the right time in my work activities). | Experimental | 4,7 | 0,58 | 100% | 3
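The summary columns of the questionnaire tables (Mean, Standard deviation, %Agree / Strongly agree, n=) can be derived from raw Likert responses as sketched below. The raw responses are not given in the source, so the values in the usage example are hypothetical. The sketch also assumes a 5-point scale on which 4 and 5 count as agree / strongly agree, and a sample standard deviation (n-1 denominator). The function name is an assumption.

```python
import statistics


def summarise_likert(responses, agree_threshold=4):
    """Summarise 5-point Likert responses in the form used by the tables."""
    n = len(responses)
    mean = statistics.mean(responses)
    # Sample standard deviation; undefined for a single response, so use 0.0.
    sd = statistics.stdev(responses) if n > 1 else 0.0
    agree = sum(1 for r in responses if r >= agree_threshold)
    return {
        "mean": round(mean, 1),
        "sd": round(sd, 2),
        "pct_agree": round(100 * agree / n),
        "n": n,
    }


# Hypothetical responses from three tutors to one statement.
print(summarise_likert([5, 5, 4]))
# {'mean': 4.7, 'sd': 0.58, 'pct_agree': 100, 'n': 3}
```

With only three to five respondents per site, a single response moves the percentage by 20 to 33 points, so the tutor figures are better read alongside the formative comments than on their own.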
Formative results with respect to validation indicator
Stakeholder type | Results
Tutors | “The main advantage of the system is that it saves time. For some of the questions it was enough for me only to flip through the lists of phrases and concepts to decide on the grade.”
Tutors | “For me it is a nice to have a repository with all the necessary information at one place”
Tutors | “Feedback is immediate, that is very important.”
Tutors | “Sometimes it takes me some time to get oriented within the list of phrases and concepts, and then to connect it to the Live Feedback measure.”
OVT: 2.2
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: It is easy (there is less cognitive load) for tutors to provide feedback and grading using LeaPos.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n=
Tutors | 70. Using LeaPos, grading learners takes a lot of mental effort. | Experimental | 2,0 | 0,82 | 0% | 5
Tutors | 71. Using LeaPos, it takes a lot of mental effort to provide feedback for learners. | Experimental | 2,0 | 0,82 | 0% | 5
Tutors | 72. Using the output provided by LeaPos, it is easy for new tutors to provide feedback and grading in the BIT training environment. | Experimental | 3,7 | 0,58 | 60% | 5
Formative results with respect to validation indicator
Stakeholder type | Results
Tutors | “It's easy to use the LeaPos Service”
Tutors | “I expect that the service is useful for new tutors”
Tutors | “The user interface could be improved – the relevant information is mostly on the end of the page”
OVT: 2.2
Pilot site: IPP-BAS
Pilot language: Bulgarian
Operational Validation Topic: It is easy (there is less cognitive load) for tutors to provide feedback and grading using LeaPos.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n=
Tutors | 70. Using LeaPos, grading learners takes a lot of mental effort. (1=strongly agree, 5=strongly disagree) | Experimental | 2,0 | 1,00 | 66,7% | 3
Tutors | 71. Using LeaPos, it takes a lot of mental effort to provide feedback for learners. (1=strongly agree, 5=strongly disagree) | Experimental | 2,0 | 1,00 | 66,7% | 3
Tutors | 72. Using the output provided by LeaPos, it is easy for new tutors to provide feedback and grading in our environment. | Experimental | 3,3 | 0,58 | 33% | 3
Formative results with respect to validation indicator
Stakeholder type | Results
Tutors | “It is much easier to find the gaps in learner's knowledge exactly because of the structured output from the system – it is fast to read and easy to use as a starting point for my feedback.”
OVT: 3.1
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: Tutors perceive that the feedback received from the system helps them prepare feedback for learners
(relevant, useful, accurate, trustworthy).
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n=
Relevant
Tutors | 73. LeaPos provides feedback that is relevant to my preparation of learner feedback. | Experimental | 4,0 | 1,0 | 60% | 5
Tutors | 74. LeaPos provides feedback that is relevant to learners. | Experimental | 4,7 | 0,58 | 60% | 5
Useful
Tutors | 75. LeaPos provides feedback that is useful to my preparation of learner feedback. | Experimental | 4,3 | 0,58 | 60% | 5
Tutors | 76. The “List of Phrases” (used and missing) provided by the system is helpful to me in preparing learner feedback. | Experimental | 4,0 | 0,00 | 60% | 3
Tutors | 77. I perceive that the “List of Phrases” (used and missing) would help learners in their studies. | Experimental | 3,7 | 0,58 | 40% | 5
Tutors | 78. The “List of Concepts” (used and missing) provided by the system is helpful to me in providing learner feedback. | Experimental | 4,0 | 0,00 | 60% | 3
Tutors | 79. I perceive that the “List of Concepts” (used and missing) would help learners in their studies. | Experimental | 4,7 | 0,58 | 60% | 5
Accurate
Tutors | 80. LeaPos feedback is sufficiently accurate to inform my feedback. | Experimental | 4,3 | 0,58 | 60% | 5
Tutors | 81. The “Grading (percentage value)” in the live-feedback represents an overview of the current position of the learner. | Experimental | 3,4 | 1,67 | 40% | 5
Tutors | 82. The “List of Phrases” (used and missing) provided by the system is mostly correct. | Experimental | 3,6 | 0,55 | 60% | 5
Tutors | 83. The “List of Concepts” (used and missing) provided by the system is mostly correct. | Experimental | 4,3 | 0,50 | 80% | 5
Trustworthy
Tutors | 84. I trust LeaPos to provide helpful feedback. | Experimental | 3,7 | 0,58 | 60% | 5
Formative results with respect to validation indicator
Stakeholder type | Results
Tutors | “The Grading is a useful hint for the tutor – more important is the list of concepts to decide if relevant knowledge is missing”
Tutors | “It's helpful to have a look at the missing phrases to recognize missing knowledge of the learner”
Tutors | “There are some missing feedback elements for some questions – e.g. 'Name at least three data storage devices and describe their properties'. For this question the learners got the information about missing storage devices but no hints for the properties of these devices.”
Tutors | “If we found missing feedback in the list of phrases it was possible to identify the reason: Not all tutors expect the same phrases in the answers, so that these phrases were not available in the existing 'gold standard' answers”.
OVT: 3.1
Pilot site: IPP-BAS
Pilot language: Bulgarian
Operational Validation Topic: Tutors perceive that the feedback received from the system helps them prepare feedback for learners
(relevant, useful, accurate, trustworthy).
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n=
Relevant
Tutors | 73. LeaPos provides feedback that is relevant to my preparation of learner feedback. | Experimental | 4,3 | 0,58 | 100% | 3
Tutors | 74. LeaPos provides feedback that is relevant to learners. | Experimental | 4,3 | 0,58 | 100% | 3
Useful
Tutors | 75. LeaPos provides feedback that is useful to my preparation of learner feedback. | Experimental | 4,3 | 0,58 | 100% | 3
Tutors | 76. The “List of Phrases” (used and missing) provided by the system is helpful to me in preparing learner feedback. | Experimental | 4,0 | 0,00 | 100% | 3
Tutors | 77. I perceive that the “List of Phrases” (used and missing) would help learners in their studies. | Experimental | 3,3 | 0,58 | 33% | 3
Tutors | 78. The “List of Concepts” (used and missing) provided by the system is helpful to me in providing learner feedback. | Experimental | 4,0 | 0,00 | 100% | 3
Tutors | 79. I perceive that the “List of Concepts” (used and missing) would help learners in their studies. | Experimental | 4,0 | 0,00 | 100% | 3
Accurate
Tutors | 80. LeaPos feedback is sufficiently accurate to inform my feedback. | Experimental | 4,0 | 0,00 | 100% | 3
Tutors | 81. The “Grading (percentage value)” in the live-feedback represents an overview of the current position of the learner. | Experimental | 3,7 | 0,58 | 67% | 3
Tutors | 82. The “List of Phrases” (used and missing) provided by the system is mostly correct. | Experimental | 3,7 | 0,58 | 67% | 3
Tutors | 83. The “List of Concepts” (used and missing) provided by the system is mostly correct. | Experimental | 4,0 | 0,00 | 100% | 3
Trustworthy
Tutors | 84. I trust LeaPos to provide helpful feedback. | Experimental | 4,0 | 0,00 | 100% | 3
Formative results with respect to validation indicator
Stakeholder type | Results
Tutors | “The links to learning materials are very useful. I like that the system provides links to relevant texts at any moment of my – and learners' – interaction with it.”
Tutors | “The list of phrases, both missing and positive, could be used to teach learners some of the typical for the professional language terms and expressions.”
Tutors | “I could evaluate the relevance of my own learning materials in the following way: inserting them as answers to a given question and then receiving the live feedback from the system.”
OVT: 3.2
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: Learners perceive that the live feedback received from the system contributes to informing their study
activities.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n=
Learners | 6. The information the system provides to me is accurate enough for helping me to perform my learning tasks. | Experimental | 3,4 | 1,18 | 52% | 25
Learners | 50. LeaPos provides feedback that is relevant to my study activities. | Experimental | 3,8 | 0,95 | 56% | 25
Learners | 51. LeaPos provides feedback that is useful to my study activities. | Experimental | 4,0 | 0,92 | 52% | 25
Learners | 52. The “List of Phrases” (used and missing) provided by the system is helpful. | Experimental | 3,9 | 0,83 | 60% | 25
Learners | 53. The “List of Concepts” (used and missing) provided by the system is helpful. | Experimental | 4,1 | 1,01 | 56% | 25
Learners | 54. LeaPos feedback is sufficiently accurate to inform my study activities. | Experimental | 4,1 | 0,87 | 68% | 25
Learners | 55. The “List of Phrases” (used and missing) provided by the system is mostly correct. | Experimental | 3,9 | 0,97 | 64% | 25
Learners | 56. The “List of Concepts” (used and missing) provided by the system is mostly correct. | Experimental | 3,8 | 1,08 | 52% | 25
Learners | 57. I trust LeaPos to provide helpful feedback. | Experimental | 3,9 | 1,04 | 68% | 25
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“There are some wrong results in the list of phrases and concepts – improvement will be helpful for the learner – it’s not an
issue for the tutor”
Learners
“To provide two different lists of text is not useful for the learner – the learner is not able to distinguish between phrases and
concepts”
OVT: 3.2
Pilot site: Sofia University
Pilot language: Bulgarian
Operational Validation Topic: Learners perceive that the live feedback received from the system contributes to informing their study activities (relevant, useful, accurate, trustworthy).
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Relevant
Learners | 6. The information the system provides to me is accurate enough for helping me to perform my learning tasks. | Experimental | 3,8 | 0,62 | 72% | 25
Learners | 50. LeaPos provides feedback that is relevant to my study activities. | Experimental | 3,9 | 0,67 | 80% | 25
Useful
Learners | 51. LeaPos provides feedback that is useful to my study activities. | Experimental | 4,0 | 0,54 | 88% | 25
Learners | 52. The “List of Phrases” (used and missing) provided by the system is helpful. | Experimental | 3,8 | 0,62 | 72% | 25
Learners | 53. The “List of Concepts” (used and missing) provided by the system is helpful. | Experimental | 3,8 | 0,55 | 76% | 25
Accurate
Learners | 54. LeaPos feedback is sufficiently accurate to inform my study activities. | Experimental | 3,8 | 0,66 | 72% | 25
Learners | 55. The “List of Phrases” (used and missing) provided by the system is mostly correct. | Experimental | 3,8 | 0,65 | 68% | 25
Learners | 56. The “List of Concepts” (used and missing) provided by the system is mostly correct. | Experimental | 3,9 | 0,53 | 80% | 25
Trustworthy
Learners | 57. I trust LeaPos to provide helpful feedback. | Experimental | 3,8 | 0,62 | 72% | 25
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“I’d like to see some ranking of the concepts - which are more important and which not that important.”
Learners
“Some of the phrases were actually just variations of one and the same phrase”
Learners
“It is really nice that you can not only to see the missing concepts, but you can also learn what they mean and see where to
get more information about them.”
OVT: 3.3
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: Learners perceive that they receive useful additional feedback, compared with traditional means
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Learners | 58. It is useful to get extra feedback from LeaPos, in addition to the tutor’s feedback. | Experimental | 3,5 | 1,17 | 36% | 25
Learners | 59. Receiving feedback from LeaPos in addition to the tutor feedback provides me with more detailed feedback (compared with the tutor feedback I got in the last course without using the Positioning Service). | Experimental | 3,3 | 1,20 | 52% | 25
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“It’s interesting to use the live feedback to find information for learning”
Learners
“I don’t believe that I got additional feedback, but I got immediately a feedback”
OVT: 3.3
Pilot site: Sofia University
Pilot language: Bulgarian
Operational Validation Topic: Learners perceive that they receive useful additional feedback, compared with traditional means
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Learners | 58. It is useful to get extra feedback from LeaPos, in addition to the tutor’s feedback. | Experimental | 4,3 | 0,79 | 80% | 25
Learners | 59. Receiving feedback from LeaPos in addition to the tutor feedback provides me with more detailed feedback (compared with the tutor feedback I got in the last course without using the Positioning Service). | Experimental | 4,1 | 0,71 | 76% | 25
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“I want to know beforehand if I can give a one word answer, or I have to justify my choice with some explanation to get the
highest grade.”
OVT: 3.4
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: Learners perceive that the system can target learning materials depending on their needs
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Relevant
Learners | 60. LeaPos provides learning materials that are relevant to my study activities. | Experimental | 3,7 | 0,93 | 60% | 25
Useful
Learners | 61. LeaPos provides learning materials that are useful to my study activities. | Experimental | 3,8 | 0,96 | 60% | 25
Learners | 62. LeaPos provides a diversity of hints (phrases, concepts and learning materials), which are useful for finding appropriate learning materials. | Experimental | 3,7 | 0,93 | 64% | 25
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“I would expect a detailed information about the relevant topics in the learning materials to save time”
Learners
“It’s important to have an open mind for the provided hints”
OVT: 3.4
Pilot site: Sofia University
Pilot language: Bulgarian
Operational Validation Topic: Learners perceive that the system can target learning materials depending on their needs (relevant, useful, trustworthy)
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Relevant
Learners | 60. LeaPos provides learning materials that are relevant to my study activities. | Experimental | 3,8 | 0,66 | 64% | 25
Useful
Learners | 61. LeaPos provides learning materials that are useful to my study activities. | Experimental | 4,0 | 0,71 | 64% | 25
Learners | 62. LeaPos provides a diversity of hints (phrases, concepts and learning materials), which are useful for finding appropriate learning materials. | Experimental | 4,1 | 0,76 | 84% | 25
Trustworthy
Learners | 63. I trust LeaPos to provide helpful learning materials. | Experimental | 3,9 | 0,70 | 80% | 25
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“One of the best features of the system is that you can access relevant materials immediately, you don’t have to search for
them on the net”
Learners
“Not all of the documents could be used as a main source, some of them were good only as supplementary materials”
OVT: 4.1
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: Tutors perceive that positioning is more effective compared with traditional means because the quality and quantity of the input to positioning is improved.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Quality
Tutors | 84. Using LeaPos, the learner’s input to positioning is a good reflection of his/her knowledge. | Experimental | 3,7 | 0,58 | 60% | 5
Quantity
Tutors | 85. Using LeaPos, I have enough information about the learner on which to base my positioning decision. | Experimental | 4,7 | 0,58 | 60% | 5
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“The LeaPos Service doesn’t outline the whole position of the learner but is very helpful to accelerate the positioning process”.
Tutors
“The list of concepts was very helpful for the positioning”
Comment
Using the LeaPos Service makes positioning more efficient and reduces the tutors’ workload compared with the traditional
positioning task.
OVT: 4.1
Pilot site: IPP-BAS
Pilot language: Bulgarian
Operational Validation Topic: Tutors perceive that positioning is more effective compared with traditional means because the quality and quantity of the input to positioning is improved.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Quality
Tutors | 84. Using LeaPos, the learner’s input to positioning is a good reflection of his/her knowledge. | Experimental | 4,0 | 0,00 | 100% | 3
Quantity
Tutors | 85. Using LeaPos, I have enough information about the learner on which to base my positioning decision. | Experimental | 4,3 | 0,58 | 100% | 3
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“Live-feedback gives a right idea of the learner’s knowledge”.
Tutors
“The LeaPos Service provides immediately positioning results for the learner which improves the efficiency of learning”
Tutors
“I would like to see the information also in context (other answers to questions; other tasks)”.
OVT: 4.2
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: Tutors perceive that using LeaPos, learners receive homogeneous feedback
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Tutors | 86. Using LeaPos, different tutors would be likely to provide very similar feedback to the same learner. | Experimental | 4,0 | 0,00 | 60% | 3
Tutors | 87. Using LeaPos, where two learners have the same missing concepts, they would receive the same hints for finding learning materials. | Experimental | 3,3 | 0,58 | 20% | 5
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“LeaPos will be helpful to consolidate the provided feedback by different tutors”
Tutors
“The missing concepts are not absolutely representing the missing knowledge”
OVT: 4.2
Pilot site: IPP-BAS
Pilot language: Bulgarian
Operational Validation Topic: Tutors perceive that using LeaPos, learners receive homogeneous feedback
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Tutors | 86. Using LeaPos, different tutors would be likely to provide very similar feedback to the same learner. | Experimental | 4,0 | 1,00 | 67% | 3
Tutors | 87. Using LeaPos, where two learners have the same missing concepts, they would receive the same hints for finding learning materials. | Experimental | 4,0 | 1,00 | 67% | 3
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“I started to grade the answers very quickly after the first 3 or 4 answers. Usually it takes me much more time to adjust my
preliminary grading methodology with regard to the test results.”
OVT: 4.3
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: Learners can receive feedback when they need it
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Learners | 63A. It is helpful to receive immediate feedback from LeaPos when I need it (no latency time for waiting). | Experimental | 4,0 | 1,07 | 64% | 25
Formative results with respect to validation indicator
Learners
“It’s exciting to use the live feedback”
Learners
“I would like to use the live feedback functionality for additional modules in the training”
OVT: 4.3
Pilot site: Sofia University
Pilot language: Bulgarian
Operational Validation Topic: Learners can receive feedback when they need it
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Learners | 63A. It is helpful to receive immediate feedback from LeaPos when I need it (no latency time for waiting). | Experimental | 4,5 | 0,59 | 96% | 25
OVT: 5.1
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: The live feedback helps learners improve their answers, so they can demonstrate their knowledge more effectively
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Learners | 64. The live feedback helps me improve my answers. | Experimental | 4,1 | 0,93 | 76% | 25
Learners | 65. The live feedback reminds me to include extra information in my answer that I had forgotten to include originally. | Experimental | 4,1 | 0,83 | 76% | 25
Learners | 66. The live feedback helps me demonstrate my knowledge more effectively. | Experimental | 4,3 | 0,7 | 80% | 25
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“The live feedback provides interesting information for me”
OVT: 5.1
Pilot site: Sofia University
Pilot language: Bulgarian
Operational Validation Topic: The live feedback helps learners improve their answers, so they can demonstrate their knowledge more effectively
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Learners | 64. The live feedback helps me improve my answers. | Experimental | 4,4 | 0,71 | 88% | 25
Learners | 65. The live feedback reminds me to include extra information in my answer that I had forgotten to include originally. | Experimental | 4,5 | 0,59 | 96% | 25
Learners | 66. The live feedback helps me demonstrate my knowledge more effectively. | Experimental | 4,1 | 0,73 | 80% | 25
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“It’s a plus that you can get a score for your performance right away.”
“It is stimulating to be able to see how to improve your answer and then to check up if this works immediately.”
“It’s nice to see that you are ‘in the green sector’.”
Learners
“You can feel the system “guides” you to improve your answer”
OVT: 6.1
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: The direct feedback provided by the system encourages learners to undertake further study to address gaps in their coverage.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Learners | 18. Using LeaPos increases my curiosity about the learning topic. | Experimental | 3,7 | 1,12 | 64% | 25
Learners | 20. Using the system motivates me to explore the learning topic more fully. | Experimental | 3,8 | 0,94 | 68% | 25
Learners | 22. I am eager to explore different things with LeaPos. | Experimental | 3,8 | 0,94 | 68% | 25
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“I used the service during the times my concentration was not good to improve my learning capacity”
OVT: 6.1
Pilot site: Sofia University
Pilot language: Bulgarian
Operational Validation Topic: The direct feedback provided by the system encourages learners to undertake further study to address gaps in their coverage.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Learners | 18. Using LeaPos increases my curiosity about the learning topic. | Experimental | 4,0 | 0,76 | 72% | 25
Learners | 20. Using the system motivates me to explore the learning topic more fully. | Experimental | 4,1 | 1,04 | 68% | 25
Learners | 22. I am eager to explore different things with LeaPos. | Experimental | 4,1 | 0,78 | 76% | 25
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“It took me more time to correct some of my answers, because I didn’t only search for the section that would answer the
question, I continued reading further”
“The question for the XML data types was surprising for me. I tried to find more information, but there wasn’t. It would have been
great if there were more documents on this topic.”
OVT: 7.1
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: There is a saving in institutional resources overall
Formative results with respect to validation indicator:
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Analysis of consumed time, Final interview with the Branch office manager bit Vienna
Results:
The following analysis has been made based on the existing IT-Basics and ECDL education project we are delivering for unemployed learners at
bit Vienna.
The following tasks and resources are needed to implement the LeaPos Service in our training environment:
Management Overhead: 5 days (once to implement a new tool)
Based on our experience of integrating new tools (e-Learning tools, testing tools), a management overhead is required for planning and for
informing the employees involved (e.g. explaining the idea and goals to the responsible employees in course management, planning the
technical implementation and internal testing, scheduling training events for the tutors, …).
Technical Implementation of the LeaPos Service: 2 days (once overall)
This task has to be done once (server setup, installation of the LeaPos Service, initial user management).
The implementation can be done by our existing technical staff during “non-peak times” and does not cause additional costs in the bit
environment.
Training for the tutors: 2-3 hours (once for each tutor, groups of between 6 and 8 tutors)
To enable and motivate the tutors to use the LeaPos Service, face-to-face training is required. The primary goal of this training is to explain
the ideas of the language technology approach and how to interpret and use the LeaPos Service results.
Guidelines for using the service in combination with other existing tools in the bit training environments are also delivered during this
training. The explanation of the user interface and the handling of the software will take only 10 to 15 minutes, because the user interface
is easy to understand.
We are able to integrate this training into the existing regularly scheduled events for our tutors (a combination of information transfer and
social event).
Establishing the initial questionnaire: 2-3 days (once per course)
Improving the existing questionnaires used in the bit group is already defined as a regular process.
To integrate the LeaPos Service into the bit courses, the existing questionnaires can be adjusted step by step during this regular
process. In this case no additional resources for this task are required.
As additional workload, the answers for the questionnaire have to be collected and graded, which will take up to about one day.
Uploading the initial questionnaire and the learning materials (continuous process)
The LeaPos Service is able to upload all of the important file types which are used for training materials, so that our tutors are able to
upload the materials themselves during their normal working times.
To improve the results of the LeaPos Service the tutors should annotate the learning materials. These annotations are used by the service
to provide appropriate training materials. The annotations made by one tutor also help the other tutors to become
familiar with new training materials. This benefit results in an overall reduced time for integrating new training materials in our environment.
Guiding the learners: 1 or 2 hours (once per learner for all courses)
To ensure that the learners will use the LeaPos Service, a guided introduction to the idea and goals of the LeaPos Service and the use of
language technologies for learning takes place during the first week of the whole education path. For learners with no IT experience,
an additional hour is allowed for becoming familiar with the interface.
Tutor resources required during the course
Tutor resources currently used in the bit environment
One tutor is responsible for supporting a group of 8 to 12 learners over a period of 6 weeks (the whole education period defined in this
project for each learner).
The tutors are responsible for delivering a few hours of face-to-face training per week and providing individualized support for each learner
during the learning process (e.g. defining next steps in learning, providing training materials, answering questions, …).
Opportunity provided by the LeaPos Service and LTfLL tools
We are establishing educational projects in this area which are based on many self-learning components. This approach is important to
reach two main goals:
- Efficiency of learning (the learner has to pass the exams at the end of the courses)
- Cost efficient learning tracks
The tutors are important for the positioning process and for supporting the learners during these self-learning time slots. The LeaPos Service
enables the learners to perform a self-positioning task for continuing their learning process.
Using this benefit we will be able to schedule a limited time in the learning process without any tutor support (or centralised support with
remote tools like video conferencing), where the tutor can be working on different job activities. E.g. a solution could be to have a
combination of 6 hours guided learning with direct tutor support and 2 hours of full self-learning without tutor support.
Currently we are using about 8 tutors at the same time for 5 days a week. Reducing the tutor support time from 8 to 6 hours a day will
result in saving 2 working days per week.
Conclusion:
For implementing the LeaPos Service for 5 different courses of the training path we will have to invest about 18 days of preparation time (5 days
for the management overhead, 2 days for the technical implementation, 1 day for the training of the tutors and 10 days for preparing the
questionnaire).
By reducing the working time of the tutor resources we will save 2 days per week. In this case the original investment of 18 days will be
recovered after a period of about 9 weeks.
To add an additional course in LeaPos to the training path we will only have to invest about 2 days for establishing the questionnaire with the
graded answers and the training materials. So once the LeaPos Service has been established as a standard tool for the training environment, the
additional investment for adding a course is acceptable.
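The arithmetic behind this conclusion can be checked with a short sketch. The figures are the ones stated in the conclusion; the variable and function names are hypothetical, and the sketch assumes that the weekly saving of 2 days stays constant and that each additional course costs about 2 days, as stated above.

```python
# Preparation effort in working days, as listed in the conclusion.
preparation_days = {
    "management overhead": 5,
    "technical implementation": 2,
    "training of the tutors": 1,
    "questionnaires for 5 courses": 10,
}

# Saving from reduced tutor support time, as stated in the conclusion.
saving_days_per_week = 2

total_preparation = sum(preparation_days.values())
break_even_weeks = total_preparation / saving_days_per_week


def break_even_with_extra_courses(extra_courses, days_per_course=2):
    """Weeks needed to recover the investment if further courses are added (assumed cost: 2 days each)."""
    return (total_preparation + extra_courses * days_per_course) / saving_days_per_week


print(f"Total preparation: {total_preparation} days")
print(f"Break-even after about {break_even_weeks:.0f} weeks")
print(f"With one additional course: about {break_even_with_extra_courses(1):.0f} weeks")
```

Under these assumptions each additional course moves the break-even point by about one week.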
OVT: 7.1
Pilot site: IPP-BAS
Pilot language: Bulgarian
Operational Validation Topic: There is a saving in institutional resources overall
Formative results with respect to validation indicator:
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: considering average set up times within flexible time constraints (2 weeks)
Results: The students and the tutors had flexible times to perform their tasks, since the tasks were external to their present courses. In general,
each student performed the task in about one hour, including consulting the relevant materials and the live feedback. On average, it took the
tutor between 10 and 15 minutes to assess and grade the answers of one student using the LeaPos Service.
OVT: 8.1
Pilot site: bit Graz
Pilot language: German
Operational Validation Topic: The service meets one or more institutional objectives.
Formative results with respect to validation indicator:
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Interview with the Branch office manager bit Vienna, CEO bit Schulungscenter
Results:
An important institutional objective is to develop improved educational solutions that achieve high-quality and cost-efficient learning concepts.
Therefore the management of the bit group is interested in using tools like the LeaPos Service or other tools of the LTfLL project in the future.
In particular, the LeaPos Service provides an additional approach to improving the learning task in our projects where unemployed people are
educated. The service could also improve the quality of learning in e-Learning solutions.
OVT: 8.1
Pilot site: IPP-BAS
Pilot language: Bulgarian
Operational Validation Topic: The service meets one or more institutional objectives
Formative results with respect to validation indicator:
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Interview with teaching manager from IPP-BAS (n=1).
Results:
An institutional objective is to re-use as much as possible from the learning resources. The main problem (probably with any new system) is the
effort that must be invested to prepare the resources initially. A clear mechanism to maintain them is also needed.
Another institutional objective is the coverage of the whole curriculum. Our courses in IT are of two types: mathematically oriented and software
oriented. The latter (with the exception of writing programs) are suitable for using LeaPos. Thus, at the moment we cover part of our curriculum,
but this is still useful and is worth it.
An important institutional objective is the cooperation among our tutors. The main usage will be in the individual work of the tutors with the
students, but if the service is widely accepted by our tutors, we could build a repository of questions to be shared by all of them. This would
improve the quality of the used questionnaires and there would be a better coverage over the learning topics.
Attracting more students is a major institutional objective. “I envisage the main role of the service as a way of attracting more students, thus
increasing their number in our IT courses (especially in MA programs)”.
OVT: 9.1
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: Users were motivated to continue to use the system after the end of the formal validation activities
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Tutors | 21. I would recommend this system to other teachers to help them in their teaching. | Experimental | 3,8 | 0,96 | 50% | 4
Tutors | 22. I am eager to explore different things with LeaPos. | Experimental | 4,0 | 0,00 | 100% | 4
Tutors | 29. I would like to use the service in my teaching after the pilot. | Experimental | 4,3 | 0,58 | 75% | 4
Tutors | 30. If the service is available after the pilot, I will definitely use it in my teaching. | Experimental | 4,3 | 0,58 | 75% | 4
Learners | 21. I would recommend this system to others | Experimental | 4,0 | 0,91 | 72% | 25
Learners | 22. I am eager to explore different things with LeaPos | Experimental | 3,8 | 1,03 | 56% | 25
Learners | 29. I would like to use the service after the pilot. | Experimental | 3,8 | 1,22 | 56% | 25
Learners | 30. If the service is available after the pilot, I will definitely use it | Experimental | 3,9 | 0,95 | 64% | 25
OVT: 9.1
Pilot site: IPP-BAS
Pilot language: Bulgarian
Operational Validation Topic: Users were motivated to continue to use the system after the end of the formal validation activities
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Learners | 21. I would recommend this system to others | Experimental | 3,8 | 0,94 | 67% | 12
Learners | 22. I am eager to explore different things with LeaPos | Experimental | 3,8 | 0,72 | 67% | 12
Learners | 29. I would like to use the service after the pilot. | Experimental | 3,7 | 0,89 | 58% | 12
Learners | 30. If the service is available after the pilot, I will definitely use it | Experimental | 3,6 | 1,08 | 50% | 12
OVT: 9.2
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: A high score was obtained in the generic questionnaires (based on UTAUT: likelihood of adoption).
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Generic questionnaire - learners
Results:
Descriptive Statistics - Learners
Scale | N | Mean | Std. Deviation
Effectiveness | 24 | 3,51 | 0,842
Efficiency | 24 | 3,03 | 0,858
Cognitive Load | 21 | 3,10 | 0,995
Usability | 25 | 3,67 | 0,712
Satisfaction | 25 | 3,69 | 0,805
Facilitating conditions | 24 | 3,61 | 0,996
Self-Efficacy | 23 | 3,86 | 0,771
Behavioural intention | 24 | 3,81 | 1,020
BIT-MEDIA | 25 | 3,58 | 0,610
Valid N (listwise) | 19
OVT: 9.2
Pilot site: IPP-BAS
Pilot language: Bulgarian
Operational Validation Topic: A high score was obtained in the generic questionnaires (based on UTAUT: likelihood of adoption).
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Generic questionnaire - learners
Results:
Descriptive Statistics - Learners
Scale | N | Mean | Std. Deviation
Effectiveness | 25 | 4,02 | 0,536
Efficiency | 25 | 4,27 | 0,473
Cognitive load | 25 | 3,76 | 1,091
Usability | 25 | 4,24 | 0,548
Satisfaction | 25 | 4,11 | 0,705
Facilitating conditions | 25 | 4,32 | 0,402
Self-Efficacy | 25 | 4,12 | 0,738
Behavioural intention | 25 | 3,98 | 0,884
IPP-BAS | 25 | 4,14 | 0,417
Valid N (listwise) | 25
OVT: 9.3
Pilot site: bit Austria
Pilot language: German
Operational Validation Topic: Tutors attending a dissemination workshop give high scores to the question 'how likely are you to consider adopting the service in your own educational practice?'
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Tutor workshop | 1. How likely are you to consider adopting LeaPos in your own educational practice? | Experimental | 4,3 | 0,55 | 60% | 5
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
The LeaPos Service is helpful for preparing the feedback for the learner during the positioning phases of the course.
Tutors
If the management of a learning institution is going to implement the LeaPos Service, the initial workload has to be divided among
all the tutors involved. This approach will make it possible to provide the LeaPos Service with high-quality questionnaires.
Section 4: Results – validation activities informing future changes / enhancements to the system
VALIDATION ACTIVITY
Pilot partner: bitmedia
Service language: German
Additional formative results (not associated with validation topics)
Alpha testing
The Knowledge Rich approach was only available for the English course (This issue was solved before the pilot).
Beta testing
The user interface is working well.
Learner focus group 1
The discussion with learner focus group brought up the following results:
“The most important feature of the LeaPos Service is the Live Feedback, which immediately provides real useful
information for the learners.”
“Some improvements for the navigation in the LeaPos Service could be made – over-all the tools is easy to use”.
“Explanations about both the result lists should be implemented in the service (e.g.: animated demo results with
explanations about how to interpret the results) – this would enable the learners to use the service without getting the
intro session.”
Tutor interviews
Manager Interview
“The testing of the LeaPos Service demonstrated the capacity of the tool to support tutors and learners.”
“The percentage value is not very important for the positioning task.”
“The two different results (phrases and concepts) should be integrated in one list, because the difference
between both results is only interesting for language experts.”
“If we used the LeaPos Service in the Internet Explorer the colours do not provide useful contrast (e.g.: question
text).”
“We are interested in the LeaPos Service as additional tool for supporting the learners and tutors”.
“The user interface should also be available in different languages (e.g.: German, Polish, Czech, …) based on
customer requirements.
“There are possibilities to implement the LeaPos Service in different learning scenarios (traditional, e-Learning
and combinations).”
VALIDATION ACTIVITY
Pilot partner: IPP-BAS
Service language: Bulgarian
Additional formative results (not associated with validation topics)
Alpha testing
Adding Bulgarian lexicalisations of concepts is not easy with the lemmatization of a morphologically rich language.
This problem was solved before the pilot.
Beta testing
The visualisation for simple text documents makes them difficult to read, because there is no text wrapping, and one
has to scroll from left to right and back to finish a paragraph.
Learner focus group 1
First, learners discussed whether or not they would use the system again. Most of the
learners would use the system again and were even surprised that it is free. One of them offered to translate the UI
into Bulgarian for free. The most important reasons why the focus group participants would use the system again were:
o The learning materials are uploaded to the system and ready to use (“You don't have to go to the library or
search for info on the net”).
o The learners appreciate that the system points out the gaps in their knowledge very precisely. Live Feedback is
quick and the process of improvement is not delayed by having to wait for the tutor's reaction.
o LeaPos suggests learning materials related to the specific topic, which makes the learning easier.
o The system stimulates the learning process, because it represents the learning process like a game that the
learners want to win (“I wanted to answer as many questions as possible with the highest score possible”).
Some of the learners wouldn't use the system again. The reasons for that were:
o The system does not look like the universally recognized model for learning management systems (Moodle).
This confused some of them, who are used to another type of scenario, where the knowledge evaluation
follows the learning process, and does not precede it.
o The interface is not intuitive and it takes time to get used to the “onion layer” model (the answer field is
wrapped in the Question field, which in turn is wrapped in the course field). When the student was asked if he
used the built-in help, he answered negatively.
o The type of answer the tutor/system expects has to be specified explicitly in the question or by length
restrictions in the answer field.
o Currently there is no way to prevent cheating (plagiarism).
The discussion about the changes and enhancements to the software was focused primarily on the improvement of
the visualisation:
o There is no information about what type of answer is expected from the learners (“Why did I get fewer points when my
answer was correct? It is a one-word answer, but it is correct.”). It would be better for learners if they knew in
advance whether they have to answer the question with one word or with a whole text.
o The system does not show a list of already answered questions as a model to follow.
o The link “Show relevant learning material” should be bigger or should look like a radio button; otherwise there is a
chance of missing it.
The learners also discussed the effect on their learning when they use the system. All of them said that the system
stimulates them to learn. Also the system helps them to improve their answers. Some of the students said that they
did not waste time when they were using the system, because there was no need to look for learning materials.
According to some of the learners, the system made them more curious, thus they explored all the uploaded
materials, even the ones which were not recommended for the given topic.
Learner focus group 2 (prioritisation of enhancements)
Learners judged that the six most important areas for enhancement of the system (i.e. clusters) are:
1. The visualisation (rendering in web browser) for simple text format files needs to be improved (NB: Learning materials
for Bulgarian were in .txt format). The readability of the document is important and motivates learners to use the materials
offered by the system.
2. Currently, there is no way to keep history of the individual learning curve (for example, how many positive vs. how
many negative answers have been received): the points received for each variant of the answer should be stored. Making
improvement visible is stimulating.
3. The concepts are not ranked: important – not-that-important – details. This would help students understand which type
of knowledge is basic and which is extra.
4. There is no way to control the time it takes for the learner to answer. Optionally, the time to answer could be limited, or
points could be deducted if the time limit is exceeded.
5. The learner cannot monitor his own progress. A list of answered/unanswered questions, and a ratio between them
might be provided.
6. The interface could be more attractive and easier to navigate.
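Enhancements 2 and 5 above (keeping a history of the points received for each variant of an answer, and letting learners monitor their own progress) could be sketched roughly as follows. This is only an illustrative sketch, not part of LeaPos: the class `ProgressLog`, its method names and the question identifiers are all hypothetical, and the "ratio" is computed here as the fraction of questions answered, which is one possible reading of the learners' request.

```python
from collections import defaultdict


class ProgressLog:
    """Hypothetical per-learner log of the points received for each answer variant."""

    def __init__(self, question_ids):
        # All questions the learner is expected to answer in the course.
        self.question_ids = list(question_ids)
        # question id -> list of points, one entry per submitted answer variant.
        self.attempts = defaultdict(list)

    def record(self, question_id, points):
        """Store the points received for one more variant of an answer."""
        if question_id not in self.question_ids:
            raise KeyError(question_id)
        self.attempts[question_id].append(points)

    def history(self, question_id):
        """Return the points of all variants submitted so far, oldest first."""
        return list(self.attempts.get(question_id, []))

    def improvement(self, question_id):
        """Difference between the latest and the first score (0 if fewer than two variants)."""
        scores = self.attempts.get(question_id, [])
        if len(scores) < 2:
            return 0
        return scores[-1] - scores[0]

    def answered_ratio(self):
        """Fraction of questions with at least one submitted answer."""
        if not self.question_ids:
            return 0.0
        answered = sum(1 for q in self.question_ids if self.attempts.get(q))
        return answered / len(self.question_ids)


log = ProgressLog(["q1", "q2", "q3", "q4"])
log.record("q1", 40)
log.record("q1", 70)
log.record("q2", 55)
print(log.history("q1"), log.improvement("q1"), log.answered_ratio())
```

Storing every variant rather than only the latest score is what would make improvement visible to the learner, which the focus group described as stimulating.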
Learners judged that the most important single improvements that should be made to the system are:
1. Currently one has to check whether there is feedback from the tutor, or know beforehand when to expect it. It would be good to get
a notification when the tutor has provided his feedback.
2. Give some motivation for the content of the selected “List of Phrases”. They are extracted from answers that were
graded with the highest grade, but sometimes they duplicate each other in content, or are not ready to be used without some
modification: “Sometimes it remains unclear why these phrases are recommended”.
3. Sometimes it is difficult to find the piece of information one needs to answer. Highlight the segment in the learning
material that is relevant to the question. If only a part of a document is relevant to the question's topic, it
would be better if it were marked somehow.
4. Adding an element of play or even competition might prove to be stimulating for some users: “Instead of just answering
a bunch of questions, make people compete for time or give them some bonuses - you know what they say - even if it is
just a ribbon, still it is a reward. A funny animation or simply “Congratulations!” message to pop-up will do too.”
Tutor interviews
Major changes identified by tutors are:
“I think the system should provide an explanation for some of its decisions. Especially in regard to the list of
missing phrases and additional concepts.”
“At the moment the service cannot generalize over the various phrases, and I am sometimes lost in more or less
synonymous expressions.”
“The software should be equally easy to use in various browsers (I could not use it with Internet Explorer).”
Manager Interview
“The system could only benefit from a spellchecker integration. In my courses spelling and grammar form an
integral part of the assessment.”
“The system has the potential to attract more students in IT area, especially in MA programs”.
“I also know about the other service of the project (6.1) and I would like to use the ontology in both of them. This
will increase the quality of the knowledge structure in our programs”
“In Bulgaria the number of students is one important element of the evaluation of the universities. Using such
service in the appropriate way will increase the number of the students”
“The students might be attracted to using such services if they have initial access to a large number of
questions, which requires a lot of initial work by LeaPos team and the tutors respectively.”
Section 5: Results – validation activities informing transferability, exploitation and barriers to adoption
VALIDATION ACTIVITY
Partner(s) involved: bitmedia
Service language: German
Additional formative results (not associated with validation topics)
Alpha testing
None
Learner focus group 1
“Explanations about both the result lists should be implemented in the service (e.g. animated demo results with
explanations about how to interpret the results) – this would enable the learners to use the service without attending the
intro session.”
Tutor interviews
“The two different results (phrases and concepts) should be integrated in one list, because the difference between
both results is only interesting for language experts.”
“The user interface should also be available in different languages (e.g. German, Polish, Czech, …) based on
customer requirements.”
Manager Interview
VALIDATION ACTIVITY
Partner(s) involved: IPP-BAS
Service language: Bulgarian
Additional formative results (not associated with validation topics)
Alpha testing
Major issues encountered in transferring LeaPos to Bulgarian:
o Adding lexicalisations (which express the ontology concepts in a natural language, in this case Bulgarian)
has to be mediated by lemmatization for languages rich in inflection. This issue was solved for the pilot by
the FLSS team.
o The available learning objects in Bulgarian did not cover the sub-domain of the BITMEDIA questionnaire.
Thus, a new questionnaire was created on one of the present sub-domains. It took roughly one hour to
introduce a distinct question. The time includes checks within the repository and ontology.
Learner focus group 1
Live feedback cannot be used when the test does not presuppose a textual representation of knowledge (for
example, solutions to mathematical problems or writing software programmes).
It was noted that the interface is in English, while the pilot was in Bulgarian. For wider use in
Bulgaria, the interface language needs to match the pilot language.
The system works at the level of individual questions (one question at a time). It would be useful to explore the connections
among several related questions.
Tutor interviews
Manager Interview
All tutors agreed that the immediate feedback will attract learners and thus it is very likely that they will be inclined to
work with LeaPos.
It is necessary to start with a hands-on type activity and train the learners how to get the best out of the live feedback,
even when the output is not very precise.
Another problem, at least at the beginning, will be the need to create a number of tests with many different types of
questions in order for the learners to be able to use the system by themselves.
The system cannot provide feedback for some parts of the courses (mathematical tasks, software code, etc.).
The ontology service of the system is very useful when combined with 6.1 semantic search for suggesting relevant
learning materials.
There is no repository of questions that could serve as a core resource for various tasks and sub-domains within IT and
be shared by the tutors.
Transferability questionnaire: Relevance of the service in other pedagogic settings
Pedagogic settings for which the service would be suitable:
Setting 1: Using LeaPos in revising for exams in universities
Reason(s): The ranking of the answers helps the students to get oriented about their knowledge with respect to the curriculum.
Setting 2: Using LeaPos for getting supplementary information on the topic in universities
Reason(s): The suggestion of learning materials on the topic when the answer is not satisfying helps the students to improve and widen their basic knowledge.
Setting 3: Using LeaPos for assessing and grading students' answers on a topic.
Reason(s): Tutors save time in LeaPos and have at their disposal important information, such as matched and missing concepts/phrases as well as the automatic live feedback.
Pedagogic settings for which the service would be less suitable:
Setting 1: In a setting where the input includes formulas and software code.
Reason(s): At the moment LeaPos can assess only textual input.
Section 6: Conclusions
Validation Topics
OVT
Operational Validation Topic
Validated
unconditionally
Validated with
qualifications*
PVT1: Verification of accuracy of NLP tools
OVT1.1
Absolute value of score: The tutors/experts find
that, when the score given by the system is
compared with the score given by the tutor, the
difference between the two values is small.
BIT-MEDIA
(German)
IPP-BAS
(Bulgarian)
OVT1.2
Relative value of score: The tutors/experts find
that, when a learner has improved his/her
answer, as judged by the tutor, an increase in
the live feedback score is observed consistently.
BIT-MEDIA
(German)
OVT1.3
KP feedback: The tutors/experts find that a
high proportion of the phrases in the two
columns (positive, missing) are judged as being
correct feedback.
BIT-MEDIA
(German)
IPP-BAS
(Bulgarian)
OVT1.4
KR feedback: The tutors/experts find that a
high proportion of the concepts in the three
columns (common, missing, additional) are
judged as being correct feedback.
BIT-MEDIA
(German)
IPP-BAS
(Bulgarian)
PVT2: Tutor efficiency
OVT2.1
Tutors spend less time preparing final feedback
for learners and grading compared with
traditional means.
IPP-BAS
(Bulgarian)
BIT-MEDIA
(German)
Not
validated
Qualifications to validation
OVT
Operational Validation Topic
OVT2.2
It is easy (there is less cognitive load) for tutors
to provide feedback and grading using LeaPos
Validated
unconditionally
Validated with
qualifications*
Not
validated
Qualifications to validation
BIT-MEDIA
(German)
IPP-BAS
(Bulgarian)
PVT3: Quality and consistency of (semi-)
automatic feedback OR information returned
by the system
OVT3.1
Tutors perceive that the feedback received from
the system helps them prepare feedback for
learners.
BIT-MEDIA
(German)
IPP-BAS
(Bulgarian)
OVT3.2
Learners perceive that the live feedback
received from the system contributes to
informing their study activities.
BIT-MEDIA
(German)
SU (Bulgarian)
OVT3.3
Learners perceive that they receive useful
additional feedback, compared with traditional
means.
SU (Bulgarian)
BIT-MEDIA
(German)
OVT3.4
Learners perceive that the system can target
learning materials depending on their needs.
SU (Bulgarian)
BIT-MEDIA
(German)
PVT4: Making the educational process
transparent
Sometimes it is difficult to
differentiate between the basic
and additional information that is
received via the system
OVT
Operational Validation Topic
Validated
unconditionally
OVT4.1
Tutors perceive that positioning is more effective
compared with traditional means because the
quality and quantity of the input to positioning is
improved
BIT-MEDIA
(German)
IPP-BAS
(Bulgarian)
OVT4.2
Tutors perceive that using LeaPos, learners
receive homogeneous feedback
IPP-BAS
(Bulgarian)
OVT4.3
Learners can receive feedback when they need
it.
Validated with
qualifications*
Not
validated
Qualifications to validation
BIT-MEDIA
(German)
Tutors need some time to adjust
their own grading system to the
one, provided by the system
BIT-MEDIA
(German)
The idea behind the usage of
phrases and concepts needs to
be made clearer to the students
BIT-MEDIA
(German)
SU (Bulgarian)
PVT5: Quality of educational output
OVT5.1
The live feedback helps learners improve their
answers, so they can demonstrate their
knowledge more effectively
BIT-MEDIA
(German)
SU (Bulgarian)
PVT6: Motivation for learning
OVT6.1
The direct feedback provided by the system
encourages learners to undertake further study
to address gaps in their coverage.
SU (Bulgarian)
PVT7: Organisational efficiency
OVT7.1
There is a saving in institutional resources
overall
BIT-MEDIA
(German)
PVT8: Relevance
IPP-BAS (Bulgarian): insufficient
evidence
OVT
Operational Validation Topic
OVT8.1
The service meets one or more institutional
objectives
Validated
unconditionally
Validated with
qualifications*
Not
validated
Qualifications to validation
BIT-MEDIA
(German)
IPP-BAS
(Bulgarian)
PVT9: Likelihood of adoption
OVT9.1
Users were motivated to continue to use the
system after the end of the formal validation
activities
BIT-MEDIA
(German)
IPP-BAS
(Bulgarian)
OVT9.2
A high score was obtained in the generic
questionnaires (based on UTAUT: likelihood of
adoption by users).
IPP-BAS
(Bulgarian)
OVT9.3
Tutors attending a dissemination workshop give
high scores to the question 'how likely are you to
consider adopting the service in your own
educational practice?'
BIT-MEDIA
(German)
BIT-MEDIA
(German)
Exploitation (SWOT Analysis)
The objective you are asked to consider is: "<service> (v1.5) will be adopted in pedagogic contexts beyond the end of the project".
Strengths
The strengths of the system (v1.5) that would be positive indicators for adoption are:
“Saving tutor time and costs of formative feedback and positioning”
Based on this benefit a reduced number of tutors are able to support a group of learners. This enables the learning provider
to save costs.
“Targeting learning materials according to learner need (short thread functionality)”
The LeaPos Service offers available learning materials to the learner. Therefore the learner is able to proceed with his
learning tasks without tutors.
“Feedback is immediately provided”
The learners are able to get the feedback from the LeaPos Service immediately without interaction from tutors and are able to
continue their learning activities.
“Exciting to use, useful and motivating”
The learners are enjoying the functional user interface and are motivated by the lists of additional phrases and concepts to
follow up their learning activities.
“Support for tutors building a repository of targeted learning materials (short thread functionality)”
The LeaPos Service offers additional benefits for a group of tutors who support the same learners. There is a
central place for the tutors to add and annotate learning materials. This functionality improves the motivation for each tutor to
add materials to the LeaPos Service, because there is a direct benefit available for all of the tutors in the group.
Weaknesses
The weaknesses of the system (v1.5) that would be negative indicators for adoption are:
“It takes time to get oriented in the result (lists)”
The result lists of the LeaPos Service provide missing phrases and concepts that were not included in the learner's
answer, which can seem to be a negative result for the learner. Therefore the learners have to be guided to use this
information as important input for their next learning activities.
“Two different lists are confusing for some learners”
Because some learners find the use of two lists confusing, the widget version can be customized to show only one
of the lists.
“There are some incorrect results included in the feedback”
The main improvement strategy to avoid incorrect results is to add graded answers for the knowledge poor approach and to
update the concepts data for the knowledge rich approach.
Once the learners understood how to interpret the result lists, they were able to ignore incorrect results in the live
feedback.
Opportunities
The system has potential as follows:
LeaPos has the potential to be appropriate in many non-self-directed learning situations, which rely on limited content.
LeaPos can be used in any short answer situation where formative feedback and/or feedback is required. Possible uses of
LeaPos can be found in primary schools through to assessing lifelong learning (as in bitmedia) or short courses in
Continuous Professional Development.
There is interest at bitmedia in extending LeaPos to other languages (e.g. Czech) and countries where bitmedia has a
presence
With some enhancement, LeaPos could be used for progress tracking.
Threats
“Implementing a new technology” Educational companies and institutes may be concerned with the required activities for
implementing a language technology based tool in their environment because they are not familiar with this technology.
“Plagiarism when answering the questions”
The tutors may be concerned about the possibility that the learners are providing the same or nearly the same answers for
the questions.
Lack of interoperability with corporate Learning Management Systems
“An ‘open mind’ for new technologies is required”
Users may be concerned about the language technology based concept of the LeaPos Service and not be prepared to try it.
Overall conclusion regarding the likelihood of adoption of LeaPos Version 1.5:
LeaPos performed strongly in all aspects of the validation, and both pilot institutions expressed their interest in continuing to use LeaPos.
Although LeaPos would benefit from improvements to the usability, it is already useful in real learning situations. LeaPos is very versatile in its
potential educational contexts of use, being appropriate for any short answer situation where formative feedback and/or feedback is required, from
primary education through to lifelong learning situations. Therefore we conclude that with effective marketing, LeaPos could become widely
adopted in real educational settings.
LeaPos offers additional functionality for the education company bitmedia to support their learners in combination with reducing the costs for
tutors. Based on these opportunities the management of the bit group will continue involving additional tutors and learners in using the LeaPos
service.
The system is very likely to be adopted by both stakeholder groups – tutors and learners. Since a manager from IPP-BAS was interviewed, there
is confirmation that the management at IPP-BAS would like to continue working with LeaPos for various projects and in the teaching courses. In
spite of the users' requests for a larger ontology and more learning materials, tutors like the system because it ensures immediate feedback to
the learners, which (1) gives the tutors some time for reaction, (2) helps the tutors in taking the right grading decision, and (3) makes the learners
eager to see themselves scoring highly in “the green sector” and to explore the suggested learning materials.
LeaPos is more appropriate for non-self-directed learning scenarios, which rely on limited content. However, there is a possibility for the repository
to be further enriched with respect to the tasks and the domain.
Most important actions to promote adoption of LeaPos:
Technical
Improve the usability to make the system more intuitive, including the on-line help
Continue to improve the accuracy of the language technologies and the Knowledge Poor feedback in particular
“Support for different languages”
The adaptation of the user interface to local languages should be simplified (e.g. via an XML configuration file) to reduce the time required for
implementing the LeaPos Service in different countries.
Investigate possible enhancement to incorporate progress tracking
Domain-specific data
“Build standard sets of questionnaires and answers and associated learning materials”
The availability of existing questionnaires and answers will speed up the implementation of the LeaPos Service for different learning providers
and learning institutes.
Commercial companies could pick up this approach as a business model.
“Provide updated concepts”
The availability of improved concepts for different courses (knowledge areas) enables the LeaPos customers to implement the Knowledge
Rich Approach without additional effort.
The required activities could be achieved by research staff in different universities, because the work is time-consuming and typical
commercial companies don't have access to language technology experts.
Exploitation
Continue to disseminate LeaPos in research and educational conferences. Consider targeting commercial companies at these conferences who could
launch LeaPos in particular domains.
Set up a user group to share experiences and assist in dissemination
Section 7 – Road map
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important future
enhancements to the system in order to meet stakeholder requirements:
Most important:
1. Adding explanations to the result lists to assist the learners to recognize the benefits of the feedback.
2. Highlighting the important phrases and concepts in the result lists of the feedback.
3. Improvements to the user interface (including the help system).
4. Explain the benefits of the phrase and concept lists in the live feedback to enable the stakeholders to choose one or both result
lists based on their requirements.
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important changes to
the current scenario(s) of use in order to meet stakeholder requirements:
Most important:
1. The learners should be guided to use the LeaPos Service more than one time during the course to get feedback for their current learning
topics.
2. The importance of using the motivation benefit during the course should be added to the scenario.
3. The functionality of direct links to relevant parts of learning materials should be added to the scenario.
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are possible additional educational
contexts for future deployment:
Most important:
1. After the LeaPos Service has been implemented and used in the classroom environment the usage can be passed over to distance
learning scenarios.
2. The LeaPos Service could be used for pre-assessments before classroom face-to-face training, to give the trainer an overview of the learners'
knowledge some days before the first day of training.
3. The LeaPos Service concept has the potential to act as a marketing tool for learning materials (books, e-learning or traditional classroom
learning) as a side effect of the live feedback lists of phrases and concepts – a commercial adaptation that includes promotion of learning
offers is possible.
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important issues for
future technical research to enable deployment of language technologies in educational contexts:
Most important:
1. Adding support for different languages in the user interface – the user interface should automatically switch to the preferred language of
the user based on his personal settings in the web browser.
2. Minimize the effort for adding questionnaires and concepts to the LeaPos service.
3. Adding export and import functionality for the system data (questionnaires, answers and learning materials for one course) to allow an
easy exchange of existing data
(Building the data for LeaPos for specific courses could be offered as a business model similar to e-learning content).
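Item 1 above asks for the user interface to follow the language preference set in the user's web browser. Browsers normally send that preference in the HTTP `Accept-Language` request header, so one possible approach is to match the header against the languages the interface has been translated into. The sketch below is only an assumption about how this could be done, not a description of LeaPos: the function name `pick_ui_language`, the set of supported languages and the English fallback are all hypothetical.

```python
def pick_ui_language(accept_language, supported, default="en"):
    """Pick the best supported UI language for an Accept-Language header value.

    Hypothetical helper: 'supported' is the set of language codes for which
    a translated interface exists; 'default' is used when nothing matches.
    """
    candidates = []
    for position, item in enumerate(accept_language.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a missing q-value means full preference
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        # Sort by descending quality, then by order of appearance in the header.
        candidates.append((-quality, position, tag))

    for neg_quality, _, tag in sorted(candidates):
        if neg_quality == 0:
            continue  # q=0 marks a language as not acceptable
        primary = tag.split("-")[0]
        if tag in supported:
            return tag
        if primary in supported:
            return primary  # e.g. "bg-BG" falls back to "bg"
    return default


print(pick_ui_language("bg-BG,bg;q=0.9,en;q=0.8", {"en", "de", "bg"}))
```

A learner whose browser is set to Bulgarian would then get the Bulgarian interface whenever a translation exists, and the default language otherwise.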
Roadmap - validation activities
Further validation planned for beyond the end of the project:
Claim (OVT): The LeaPos Service provides benefits for the ECDL (European Computer Driving Licence) education in the School environment in
Austria
Methodology:
Transferring the LeaPos Service to the Secondary School Environment in Austria as a pilot.
Objective (OVT): Adopting the LeaPos Service for a new course without using language technology expert support
Methodology:
Adding a new course to the LeaPos Service and using this course as a pilot at bitmedia.
Appendix B.3 Validation Reporting Template for WP4.2 (UNIMAN and OUNL)
Section 1: Functionality implemented in Version 1.5 and alpha / beta-testing record
Many changes were made in v1.5 and documented in the LTfLL Deliverable 4.3, Appendix B. The major changes are listed below.
Brief description of functionality | Version number of unit | Changes from Version 1.0
1. a list of concepts, which is an alternative view of the data shown in a conceptogram; different colours show which concepts came from which posting in the RSS feed | v1.5 | did not exist in V1.0
2. a clearer combined conceptogram | v1.5 | font, size, and colour changes
3. a multiple-merge conceptogram which combines three or more single conceptograms | v1.5 | did not exist in V1.0
4. a link from concepts to source, i.e., a particular blog posting where the concept was written about | v1.5 | did not exist in V1.0
5. extended context specific Help pages | v1.5 | improved help
Alpha-testing.
Pilot site and language
UNIMAN
Date of completion of alpha testing:
7 October 2010
Who performed the alpha testing?
Alisdair Smithies, Isobel Braidman
Pilot site and language
OUNL (Dutch)
Date of completion of alpha testing:
7 October 2010; 14 October 2010
Who performed the alpha testing?
Adriana Berlanga, Jan Hensgens
Beta-testing
Pilot site and language: UNIMAN, English
Has stakeholder validation taken place using the service embedded in Elgg (Yes/No/Partially): Yes
If ‘No’ or ‘Partially’, give reasons:
beta-testing performed by:
Michelle Keown (Tutor), Tristan Pocock (Tutor), Peter Yeates (PhD student)
beta testing environment (stand-alone service / integrated into Elgg):
HANDOVER DATE:
22 October 2010
(Date of handover of software v.1.5 for validation)
Pilot site and language: OUNL, Dutch
Has stakeholder validation taken place using the service embedded in Elgg (Yes/No/Partially): No
If ‘No’ or ‘Partially’, give reasons: The widget did not work at that time in Elgg
beta-testing performed by:
Jannes Eshuis (Tutor), Theo Verheggen (Tutor)
beta testing environment (stand-alone service / integrated into Elgg):
Stand-alone service
HANDOVER DATE:
8 November 2010
(Date of handover of software v.1.5 for validation)
Section 2: Validation Pilot Overview
NB Information about pilot sites, courses and participants has been transferred to Appendix A.3
Pilot task
Pilot site: UNIMAN
Pilot language: English
What is the pilot task for tutors and how do they interact with the system?
Tutors were provided with access to the service.
• Tutors collaborated to produce a blog (representative of the Intended Learning Outcomes for a specific Problem Based Learning Case).
• Tutors accessed CONSPECT and produced a reference model from their blog.
• Tutors then viewed the outputs of CONSPECT's analysis.
• Tutor reference model conceptogram and list of concepts
• The outputs of analysis of an individual student's blog
• Student group reference model
• Output of the student group reference model compared with tutor reference model
• Output of an individual compared with tutor reference model
• Check the list of concepts and draw conclusions about the progress of the student
• Provide comments or feedback on the outputs to the pilot facilitator, who forwards any necessary actions to the students.
What do the tutors produce as outputs? Are the outputs marked?
Yes, conceptograms and concept lists. They were not marked
How long does the pilot task last, from the tutors starting the task to their final involvement with the software?
3 weeks
How do tutors/student facilitators interact with the learners and the system?
They review the conceptograms and concept lists.
Describe any manual intervention of the LTfLL team in the pilot:
UNIMAN provided the tutors and students with OpenID accounts and instructions for how to access and use the system.
Pilot site: UNIMAN
Pilot language: English
This box describes the pilot task for students.
What is the pilot task for students and how do they interact with the system?
Students study a PBL case, “Helen”, from the Mind and Movement module, and were asked to keep a blog of their study on Elgg. They used
CONSPECT to produce reference models, create comparison between them and draw conclusions.
They were provided with the following instructions:
Use CONSPECT to:
• Create and view your reference model
• Check the concept list and the graph
• Make annotations about your thoughts
• Compare your reference model with those of other individual students
• Combine your reference model vs. the tutor model
• Combine your reference model vs. the group model
• Make your reference model public, and compare it with other public reference models
What do the students produce as outputs? Are the outputs marked?
Conceptograms. They were not marked, but were viewed by tutors, who were asked for their feedback.
How long does the pilot task last, from the students starting the task to their final involvement with the software?
3 weeks
How do tutors interact with the learners and the system?
Module tutors were presented with the outputs from the system and invited to provide feedback to students, via email.
Describe any manual intervention of the LTfLL team in the pilot:
OpenID accounts were provided, along with a user guide of the tool.
Pilot site: OUNL
Pilot language: Dutch
There were pilot workshops for learners and students. This box describes the pilot task for tutors.
What is the pilot task for tutors and how do they interact with the system?
Tutors were presented with a case, and were asked to use CONSPECT to produce reference models, compare them and draw
conclusions.
They were provided with the following case:
Imagine that you have asked your students to keep a blog, in which they write up their study tasks.
You want to see how they are doing, checking whether the concepts they mention in their texts are those they are required to learn in the
course, individually and as a group.
… You decide to use CONSPECT and:
• Create at least 2 reference models:
    • Tutor or “book” ref. model
    • Student or group ref. model
• Compare the reference models
• Combine the reference models you've created
• Check the list of concepts and draw conclusions about the progress of the student
• Make your reference models public, and compare them with other public reference models
To perform these tasks, tutors were provided with examples of tutors' blogs, students' blogs and group blogs. The tutors had earlier provided these
materials, taken from students' answers to specific assignments, tutors' materials and a workbook.
What do the tutors produce as outputs? Are the outputs marked?
Conceptograms. They were not marked
How long does the pilot task last, from the tutors starting the task to their final involvement with the software?
2 hours, excluding the preparation time some of them invested in producing or obtaining the materials for the blogs.
How do tutors interact with the learners and the system?
Tutors had to use the system to perform the tasks. They created conceptograms and reviewed them. They also compared the
different outputs of the system and discussed among themselves how they found the information the system provides, as well as the interaction with the
system.
Describe any manual intervention of the LTfLL team in the pilot:
OUNL created the blogs (using the material provided by the tutors). OpenID accounts were provided, along with a user guide for the tool and a list of LSA
parameters that work with the provided examples.
Pilot site: OUNL
Pilot language: Dutch
There were pilot workshops for learners and students. This box describes the pilot task for students.
What is the pilot task for students and how do they interact with the system?
Students were presented with a case, and were asked to use CONSPECT to produce reference models, compare them and
draw conclusions.
They were provided with the following case:
You've been asked to write an essay, which answers four questions related to the first part of the course about “De vele vormen van
selectie”.
You want to check whether the concepts you mention in your text are relevant for the assignment.
… You decide to use CONSPECT and:
• Create your reference model
• Check the concepts and the graph
• Make annotations about your thoughts
• Compare your reference model
• Combine your reference model vs. the tutor model
• Combine your reference model vs. the group model
• Make your reference model public, and compare it with other public reference models
Before the pilot session, students were asked to write an essay that answered 4 specific questions related to the first part of their course. This was the
input they would use during the activity. Also available in the system were a concept map created from a tutor's blog (which had been
created using the tutor's answer to the assignment) and a group concept map, which was created by aggregating the input received from the
students (their essays) into a blog entry.
What do the students produce as outputs? Are the outputs marked?
Conceptograms. They were not marked but during the session these conceptograms were used by the tutor (who was present in the session) to
discuss the input with the student.
How long does the pilot task last, from the students starting the task to their final involvement with the software?
2 hours, excluding the preparation time they invested in producing their essay.
How do tutors interact with the learners and the system?
A tutor was present in the sessions and gave students some feedback based on the output from the system.
Describe any manual intervention of the LTfLL team in the pilot:
OUNL created the blogs (using the material provided by the students and the tutor). OpenID accounts were provided, along with a user guide for the tool and
a list of LSA parameters that work with the provided examples.
Experiments
Name of experiment: Validation of the outcomes of the tool
Objective(s):
The objective of this experiment was to ask tutors to check whether the output CONSPECT provides includes the concepts on which they would provide
feedback, and whether the tool has correctly categorised a high proportion of concepts as important (as requested in OVT1.1 and OVT1.2).
Details:
Tutors were asked to provide materials to create reference models. They provided course material, students' texts (n=5) and a digitised book in
two subjects: Evolutionary Psychology and Cultural Psychology.
With this information conceptograms were created: a reference model, a group model and students' models. Comparisons were made between them,
and the results were presented to and discussed with tutors. Afterwards, tutors had to indicate whether the conceptograms covered the material sufficiently well. Tutors
mentioned that the conceptograms included relevant concepts and that the relations seemed quite good, but that important concepts were also
missing from the map. There were also concerns about the level of detail that the conceptogram shows.
Tutors also found it difficult to interpret the conceptograms. They mentioned as well that it is difficult to understand why some concepts that do not
appear in the text do appear in the conceptogram. On the one hand this can be useful for identifying new concepts and relations; on the other it
could be confusing and misleading for learners. One of the tutors mentioned that he would like to provide a list of “most relevant concepts” in
advance, from which the tool would then build the conceptogram.
Section 3: Results - validation/verification of Validation Topics
OVT: OVT1.1
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Tutors have assessed that a high percentage of the concepts identified by CONSPECT are relevant to the task the learners have undertaken.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology:
Tutors (n=2) assessed the relevance of concepts reported by CONSPECT to the learning task from 5 sets of results.
Results:
On average, 32% of the terms reported by CONSPECT were deemed relevant to the learning task; 68% were considered irrelevant.
OVT: OVT1.2
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Tutors have assessed that CONSPECT had identified most of the concepts on which they would provide feedback
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology:
Tutors (n=2) assessed the number of concepts reported by CONSPECT that were relevant to the materials the students had produced, in 5 sets of
results.
Results:
Tutors assessed that CONSPECT identified, on average, 23% (mean = 8.4 topics) of the topics from the student materials that were suitable to provide
feedback on.
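One way to read the two UNIMAN indicators above is as two different shares: OVT1.1 asks what fraction of the terms CONSPECT reported were relevant, while OVT1.2 asks what fraction of the topics a tutor would give feedback on were identified. The sketch below illustrates that reading only. The concept names, the exact-match comparison of sets and the function names are all assumptions made for illustration; none of them come from the pilot data or from CONSPECT itself.

```python
def share_relevant(reported, relevant):
    """Fraction of reported concepts that the tutor judged relevant (OVT1.1 reading)."""
    reported = set(reported)
    if not reported:
        return 0.0
    return len(reported & set(relevant)) / len(reported)


def share_identified(reported, feedback_topics):
    """Fraction of the tutor's feedback topics that were reported (OVT1.2 reading)."""
    feedback_topics = set(feedback_topics)
    if not feedback_topics:
        return 0.0
    return len(set(reported) & feedback_topics) / len(feedback_topics)


def mean_share(shares):
    """Average a per-result-set share across several result sets."""
    shares = list(shares)
    return sum(shares) / len(shares)


# Hypothetical single result set with invented concept names (not pilot data).
reported = ["alcohol", "memory", "student", "week", "case"]
tutor_relevant = ["alcohol", "memory"]
tutor_feedback_topics = ["alcohol", "memory", "hippocampus", "emotion"]

relevant_share = share_relevant(reported, tutor_relevant)            # 2 of 5 reported terms
identified_share = share_identified(reported, tutor_feedback_topics)  # 2 of 4 feedback topics
```

Averaging such per-set shares over the 5 result sets with `mean_share` would give a single summary figure of the kind reported above, under the stated assumptions.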
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Tutors | Q35 I am able to assess that CONSPECT had identified most of the concepts on which they would provide feedback | Experimental | 1.8 | 0.84 | 0% | 5 |
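The questionnaire tables in this section report a mean, a standard deviation and a percentage of "Agree / Strongly agree" responses per statement. The sketch below shows how such a row could be computed from raw 5-point responses. It assumes that the standard deviation is the sample standard deviation (n - 1 in the denominator) and that scores of 4 and 5 count as agreement; the source does not state either convention. The response list is hypothetical and is not the pilot data; it was chosen only because it is consistent with a row reporting mean 1.8, SD 0.84, 0% and n=5.

```python
from statistics import mean, stdev


def summarise_likert(responses, agree_threshold=4):
    """Return (mean, sample SD, % agree or strongly agree) for 5-point responses.

    Needs at least two responses, because the sample SD is undefined for one.
    """
    n = len(responses)
    pct_agree = 100 * sum(r >= agree_threshold for r in responses) / n
    return round(mean(responses), 2), round(stdev(responses), 2), round(pct_agree)


# Hypothetical responses from five tutors (NOT the pilot data).
example = [1, 1, 2, 2, 3]
summary = summarise_likert(example)
```

With so few respondents, a single changed answer moves every column noticeably, which is worth keeping in mind when comparing rows.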
Formative results with respect to validation indicator
Stakeholder type: Tutors
Results:
Tutors felt the relevance of concepts provided in the present version of CONSPECT was not of sufficient “depth or accuracy”
to allow them to provide detailed feedback. Stemming of words was seen as a key weakness of the service:
“in the medical domain the specificities of technical words can have vastly different implications, this is not accurate enough,
I'd have to guess really”
“I perceive that the concept of a package like conspect could be extremely useful in teaching - in essence in allowing very
rapid comparison of student responses with a model answer. The key to such a process, though, critically depends on the
trustworthiness of the output and the degree to which the output is meaningful.”
“Most important: the software cannot be used for students at this level without providing some means of helping tutors and
students to analyse the depth of their understanding, rather than the superficialities presently included”
OVT: OVT1.2
Pilot site: OUNL
Pilot language: Dutch
Operational Validation Topic: Tutors have calculated that CONSPECT had identified most of the concepts on which they would provide feedback.
Formative results with respect to validation indicator
Stakeholder type: Tutors
Results:
Tutor responses were negative about the relevance of concepts identified by CONSPECT but positive about the method it
uses. Identified concepts were considered irrelevant or too general to be of sufficient value in assessment of a learner's
understanding in a task:
“I don't think that the information represented in the bubbles in the diagram is all that representative of the meaning in the text
either in terms of the balance of topical coverage, or the organisation of the concepts relative to each other.”
“I do miss some important terms while others are superfluous”
OVT: OVT1.3
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Tutors have assessed that CONSPECT has provided appropriate linkages between concepts for most of the relevant conceptual relations.
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Tutors | Q36 I am able to assess that CONSPECT has provided appropriate linkages between concepts for most of the relevant conceptual relations. | Experimental | 1.4 | 0.89 | 0% | 5 |
Formative results with respect to validation indicator
Stakeholder type: Tutors
Results:
The tutors felt that CONSPECT's ability to provide appropriate linkages was not sufficiently effective.
“It needs to expose interesting/important relationships between topics”
“The tutor blog does contain expected details of the cellular basis for memory and emotional responses, together with the way
in which alcohol consumption might affect this. The best that the conceptogram could do was to relate alcohol with
consumption and alcoholism.”
OVT: OVT1.4
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Tutors have assessed that most of the concepts have been correctly categorised as important.
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Tutors | Q37 I am able to assess that most of the key concepts have been correctly categorised as important | Experimental | 2.0 | 1.22 | 20% | 5 |
OVT: OVT1.4
Pilot site: OUNL
Pilot language: Dutch
Operational Validation Topic: Tutors have calculated that a high proportion of concepts have been correctly categorised as important.
Formative results with respect to validation indicator
Stakeholder type: Tutors (interview, n=2)
Results:
“Some terminology, that I really would have expected to turn up in the concept maps, it did not turn up, even though these are
terms that are repeated quite often through the design maps. So the level of depth required in a University context is not well
covered by the maps”
OVT: OVT2.1
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Using CONSPECT, tutors spend less time preparing feedback than without the system
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Tutors | Q7 It takes less time to complete my teaching tasks using CONSPECT than without the system. | Experimental | 2.4 | 1.14 | 20% | 5 |
| Tutors | Q8 Using CONSPECT enables me to work more quickly than without the system. | Experimental | 2.2 | 1.30 | 20% | 5 |
| Tutors | Q9 I do not wait too long before receiving the requested information. | Experimental | 2.4 | 1.10 | 0% | 5 |
| Tutors | Q10 CONSPECT provides me with the requested information when I require it (i.e. at the right time in my work activities). | Experimental | 2.2 | 1.10 | 0% | 5 |
Formative results with respect to validation indicator
Stakeholder type: Tutors
Results:
Tutors (n=2) commented that CONSPECT took a long time to process the blogs and they often thought the service had
“hung”. There were concerns that students would not engage with the service if this was the norm.
“…it just stopped and nothing happened! I didn't know what to do. It could have shown a pop up box showing that it was
processing at the very least”
Stakeholder type: Tutors
Results:
Tutor (n=1) was positive about the service's potential to assess large numbers of student blogs in a short time, and provide a
prompt indication of who was not engaging in the learning task.
“This could be really useful to show, on a short turnaround, how many of the students are actually understanding the things
we've asked them, I guess, though I'm not sure how easy it would be to gauge that from the interfaces you've shown me… is
there a summary view of all students anywhere?”
OVT: OVT2.2
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: It is easier (there is less cognitive load) for tutors to provide feedback using CONSPECT compared with not using the system
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Tutors | Q11a Please rank on a 5-point scale the mental effort (1 = very low mental effort; 5 = very high mental effort) you invested to accomplish teaching tasks using CONSPECT. | Experimental | 3.6 | 1.67 | 60% | 5 |
| Tutors | Q11b Overall, using the system requires significantly less mental effort to complete my teaching tasks than when manually assessing learners' conceptual coverage. | Experimental | 2.4 | 0.89 | 0% | 5 |
OVT: OVT3.1
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Tutors judge that CONSPECT shows correctly the conceptual coverage of a given topic.
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Tutors | Q41 CONSPECT shows correctly the conceptual coverage of a given topic. | Experimental | 2.2 | 1.30 | 20% | 5 |
Formative results with respect to validation indicator
Stakeholder type: Tutors
Results:
Tutor responses were negative about the relevance of concepts identified by CONSPECT but positive about the method it
uses. Identified concepts were considered irrelevant or too general to be of sufficient value in assessment of a learner's
understanding in a task:
“I don't think that the information represented in the bubbles in the diagram are all that representative of the meaning in the
text either in terms of the balance of topical coverage, or the organisation of the concepts relative to each other.”
“The output is extremely reductionist and gives very little idea of the level of understanding or complexity behind each of the
concepts shown.”
“A major problem with this version CONSPECT, however, is that it does not provide sufficient depth to be of use, as the
concepts that it identifies are far too general and superficial.”
“Results don't expose any 'meaning' from the inputs”
“A major concern is that the general nature of the concepts identified gives the students the wrong message that this level is
adequate preparation for module assessments. They are not adequate for this purpose and this makes tutors' jobs more
difficult than had they been giving feedback without CONSPECT”
“Given the general nature of the concepts identified I suggest it could be more useful to students either at a more basic level in
their education, for example the “access to medicine course” or to other areas and disciplines than medicine, that require use
of more general concepts”
OVT: OVT3.1
Pilot site: OUNL
Pilot language: Dutch
Operational Validation Topic: Tutors judge that CONSPECT shows correctly the conceptual coverage of a given topic.
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Tutors | Q6 The information CONSPECT provides me is accurate enough for helping me perform my teaching tasks. | Experimental | 2.2 | 0.84 | 0% | 5 |
Formative results with respect to validation indicator
Stakeholder type: Tutors
Results:
Tutors reported that CONSPECT identified important terms, but that other important concepts were missing:
“The concepts mentioned in these plots are pretty good. I do miss some important terms while others are superfluous, but I'd say
approx. 75-80% of what should be there is actually plotted”
It was also not clear to them why some concepts that are not present in the text appear in the graph, while others that are
important and present in the text do not show up in the graph.
OVT: OVT3.2
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Tutors agree with the learner progress shown by CONSPECT
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Tutors | Q42 Tutors agree with the learner progress shown by CONSPECT | Experimental | 2.6 | 1.14 | 20% | 5 |
Formative results with respect to validation indicator
Stakeholder type: Tutors
Results:
Tutors found CONSPECT's outputs did not provide a clear picture of a learner's progress.
The visualisation of all student outputs combined is difficult to see. There is too much information on the screen to be able to
easily identify different learners.
“It needs to be more user friendly and the concept maps need to be clearer. They move about too much and it's difficult to
separate the different concepts. It's difficult to understand the joint concept maps.”
OVT: OVT3.2
Pilot site: OUNL
Pilot language: Dutch
Operational Validation Topic: Tutors agree with the learner progress shown by CONSPECT
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Tutors | The way CONSPECT provides information (list concepts, graphical representation) is useful to identify learners' progress. | Experimental | 3.0 | 0.71 | 20% | 5 |
| Tutors | The way CONSPECT provides information (list concepts, graphical representation) is useful to identify the progress of a group of learners | Experimental | 3.4 | 0.55 | 40% | 5 |
Formative results with respect to validation indicator
Stakeholder type: Tutors
Results:
Tutors found it difficult to interpret the conceptograms.
“I find it hard to see what the graphs tell me. I do see the concepts, their overlap and omissions”; “I find it hard to make sense of
the graphs; clusters of concepts are usually okay, but I would expect some terms to be much more central or to mediate between
clusters. At other instances, I find unconnected dots that represents rather central terms”
They preferred the idea of a graph to a list of concepts, especially because the graph contains the relationships
between concepts.
OVT: OVT3.3
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Students feel the feedback provided supports them in adapting their learning plans.
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Students | Q31 The feedback provided supports me to adapt my learning plans. | Experimental | 3.3 | 0.89 | 43% | 16 |
Formative results with respect to validation indicator
Stakeholder type: Students
Results:
“The results published were confusing. I found it very difficult to understand what the results signified. It would be simpler to
understand feedback in a score or percentage format.”
OVT: OVT3.3
Pilot site: OUNL
Pilot language: Dutch
Operational Validation Topic: Students feel the feedback provided supports them in adapting their learning plans.
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Students | I think that the information CONSPECT provides helps me to be better informed about my current learning progress. | Experimental | 3.5 | 1.29 | 50% | 4 |
| Students | I think that comparing my graphical representation against a group representation is useful to identify my progress. | Experimental | 3.5 | 0.58 | 50% | 4 |
| Students | I think that comparing my graphical representation against a predefined representation (e.g. Tutor representation) is useful to identify my progress. | Experimental | 4.0 | 0.82 | 50% | 4 |
| Students | I think that the information CONSPECT provides helps me to identify knowledge gaps in my current learning progress. | Experimental | 2.8 | 1.50 | 50% | 4 |
Formative results with respect to validation indicator
Stakeholder type: Students
Results:
Students were positive about the concept: “I like the concept, to see how concepts are related and what are the missing ones”
“I like the idea a lot. I can see the concepts I've miss in my summaries”
“I like the graph visualization to have a nice overview of what I have missed”
“I think it can become a handy tool in the future”
Some of the students did not consider the group model relevant for them: “I don't trust my peer's texts, I prefer to
have the tutor text”
Students mentioned that they did not understand why only the stem of each word was shown, and found it difficult to
understand why some words that are not in the text appear in the conceptogram:
“I don't understand why only parts of the word are displayed, the only first part could mean different concepts”
Students also mentioned that they would like more information about the relations (semantic meaning), as well as
more information to help interpret the map.
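The concern that a displayed stem "could mean different concepts" can be illustrated with a small sketch. The code below is a deliberately naive suffix stripper written for this illustration only. It is not the stemmer used by CONSPECT (the source does not say which one that is), and the suffix list and example words are assumptions. It shows how several distinct surface words can collapse onto one stem, so that a reader who sees only the stem cannot tell which word was meant.

```python
# Hypothetical suffix list, longest alternatives first so "ics" wins over "ic" and "s".
SUFFIXES = ("isation", "ization", "ism", "ics", "ic", "ing", "ed", "s")


def toy_stem(word, min_stem=4):
    """Strip the first matching suffix, keeping at least min_stem characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


def group_by_stem(words):
    """Map each stem to the surface words that collapse onto it."""
    groups = {}
    for w in words:
        groups.setdefault(toy_stem(w), []).append(w)
    return groups


# Invented example words; all four end up under the single stem "alcohol".
groups = group_by_stem(["alcohol", "alcoholism", "alcoholic", "alcoholics"])
```

Showing a representative full word for each stem, rather than the stem itself, is one possible way to address the feedback above; whether that suits CONSPECT is not something the source establishes.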
OVT: OVT3.4
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Students agree with the feedback provided
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Learners | Q32 (Learners) I agree with the feedback provided. | Experimental | 3.1 | 0.72 | 31% | 16 |
Formative results with respect to validation indicator
Stakeholder type: Students
Results:
“It was helpful to confirm that I had covered some of the main concepts but the level of detail wasn't always so helpful.”
“I like comparing my own model with the intended learning outcomes model. This was helpful to identify weaknesses in my
blog and helped by suggesting topics to cover.”
“The results published were confusing. I found it very difficult to understand what the results signified. It would be simpler to
understand feedback in a score or percentage format.”
OVT: OVT4.1
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Students are able to position themselves whenever they want using CONSPECT
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Students | Q33 I am able to position myself whenever I want | Experimental | 3.3 | 0.79 | 50% | 16 |
Formative results with respect to validation indicator
Stakeholder type: Students
Results:
“Easier access than open ID. I found the login and access to the blogs very hit and miss.”
Students found that the service did not always recall their results on the first attempt, and there were some instances where the
software was not accessible; this resulted in students recording a low score for this question.
OVT: OVT4.2
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Using CONSPECT, tutors are able to assess the conceptual progress of their students based on their reflective documents.
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Tutors | Q2 Overall, CONSPECT helps me to complete my teaching tasks successfully. | Experimental | 2.4 | 1.14 | 20% | 5 |
| Tutors | Q3 Overall, I believe that CONSPECT provides adequate support for my teaching. | Experimental | 2.4 | 1.14 | 20% | 5 |
| Tutors | Q4 Overall, I find CONSPECT useful in my teaching. | Experimental | 2.4 | 1.14 | 20% | 5 |
| Tutors | Q38 I am able to assess the conceptual progress of my students based on their reflective documents. | Experimental | 2.4 | 1.52 | 40% | 5 |
Formative results with respect to validation indicator
Stakeholder type: Tutors
Results:
Tutors (n=3) commented that the direct feedback provided by the service was not trustworthy and that each analysis would require
further investigation prior to release of the feedback to ensure it was not misleading:
“A major concern is that the general nature of the concepts identified gives the students the wrong message that this level is
adequate preparation for module assessments. They are not adequate for this purpose and this makes tutors' jobs more
difficult than had they been giving feedback without CONSPECT”
Stakeholder type: Tutors
Results:
Tutor (n=1) was positive about the service's utility for this purpose but found the interface was not sufficiently intuitive.
OVT: OVT4.2
Pilot site: OUNL
Pilot language: Dutch
Operational Validation Topic: Using CONSPECT, tutors are able to assess the conceptual progress of their students based on their reflective documents.
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Tutors | Q2 Overall, CONSPECT helps me to complete my teaching tasks successfully. | Experimental | 2.2 | 0.8 | 0% | 5 |
| Tutors | Q3 Overall, I believe that CONSPECT provides adequate support for my teaching. | Experimental | 2.4 | 1.14 | 20% | 5 |
| Tutors | Q4 Overall, I find CONSPECT useful in my teaching. | Experimental | 2.6 | 0.89 | 20% | 5 |
Formative results with respect to validation indicator
Stakeholder type: Tutors
Results:
Tutors reported that they could see potential in the method implemented within the service to identify areas for additional
support but that the information presented to them by the service in its present form did not provide a sufficient level of detail
on which to base feedback.
Tutors were able to cross-check the outputs against student blogs and found this a useful way to “drill down” to examine
whether a student had covered a topic area effectively, but were critical that CONSPECT had not picked up some of the
important concepts and instead presented a large quantity of superficial data.
OVT: OVT4.3
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Tutors are able to locate the outliers within their groups
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Tutors | Q39 I am able to locate outliers within my group | Experimental | 2.0 | 1.22 | 20% | 5 |
Formative results with respect to validation indicator
Stakeholder type: Tutors
Results:
Tutors reported that the conceptogram output was unclear and difficult to interpret with the number of students usually present
in a PBL case.
In interviews, tutors (n=5) reported that this was one of the most important features they would like the service to provide, but
were dissatisfied with the way the present visualisation provides this information.
OVT: OVT4.3
Pilot site: OUNL
Pilot language: Dutch
Operational Validation Topic: Tutors are able to locate the outliers within their groups
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Tutors | Q34. The way CONSPECT provides information (list concepts, graphical representation) is useful to identify learners' progress. | Experimental | 3.0 | 0.71 | 20% | 5 |
| Tutors | Q35. The way CONSPECT provides information (list concepts, graphical representation) is useful to identify the progress of a group of learners | Experimental | 3.4 | 0.55 | 40% | 5 |
Formative results with respect to validation indicator
Stakeholder type: Tutors
Results:
Tutors reported that the conceptograms were difficult to interpret. Combining more than two conceptograms
produced a display that was impossible to interpret.
“I find it hard to see what the graphs tell me. I do see the concepts, their overlap and omissions”; “I find it hard to make sense
of the graphs; clusters of concepts are usually okay, but I would expect some terms to be much more central or to mediate
between clusters. At other instances, I find unconnected dots that represents rather central terms”
They preferred the idea of a graph to a list of concepts, especially because the graph contains the relationships
between concepts.
In interviews, tutors (n=5) also expressed concerns about the creation of the group model, which the system does not
generate automatically. Creating an aggregated text could be very time consuming for large groups of students, and the text
could change constantly, making it difficult to create and maintain.
OVT: OVT4.4
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Tutors are able to provide extra support for the problematic outliers during the learning process
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Tutors | Q5 The CONSPECT service helps me to improve the quality of my support to learners. | Experimental | 2.6 | 0.89 | 20% | 5 |
| Tutors | Q40 I am able to provide extra support for the problematic outliers during the learning process | Experimental | 2.0 | 1.22 | 20% | 5 |
Formative results with respect to validation indicator
Stakeholder type: Tutors
Results:
Tutors reported that they could see potential in the method implemented within the service to identify areas for additional
support but that the information presented to them by the service in its present form did not provide a sufficient level of detail
on which to base feedback.
Tutors were able to cross-check the outputs against student blogs and found this a useful way to “drill down” to examine
whether a student had covered a topic area effectively, but were critical that CONSPECT had not picked up some of the
important concepts and instead presented a large quantity of superficial data.
OVT: OVT4.4
Pilot site: OUNL
Pilot language: Dutch
Operational Validation Topic: Tutors are able to provide extra support for the problematic outliers during the learning process
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n |
|---|---|---|---|---|---|---|
| Tutors | Q5 The CONSPECT service helps me to improve the quality of my support to learners. | Experimental | 2.8 | 1.30 | 40% | 5 |
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
Tutors reported that they could see potential in the method implemented within the service to identify areas for additional
support but that the information presented to them by the service in its present form did not provide a sufficient level of detail
on which to base feedback.
Tutors were able to cross-check the outputs against student blogs and found this a useful way to “drill down” to examine
whether a student had covered a topic area effectively, but were critical that CONSPECT had not picked up some of the
important concepts and instead presented a large quantity of superficial data.
OVT: OVT5.1
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Students are able to use the feedback given during the writing process to help them to improve the final texts
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Students | Q34 The feedback given during the writing process helps me to improve the final texts. | Experimental | 3.4 | 0.96 | 56% | 16
Formative results with respect to validation indicator
Stakeholder type
Results
Students
“I have developed working on cues, from the concept list, and making the information concise”
OVT: OVT6.1
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Students find the feedback provided encourages them to undertake further study
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Students | Q35 The feedback given encourages me to undertake further study | Experimental | 3.5 | 0.82 | 56% | 16
Formative results with respect to validation indicator
Stakeholder type
Results
Students
“It has made me reflect on what work I actually did and writing the blog worked as a good way of looking back on what work I
had done and how much I had actually remembered. This way it worked as a revision-like process.”
"It would probably also help in writing a better portfolio, as in each case the links between different topics would be clearer.
It would also serve as a good summarising tool for those such as myself who don't usually draw all of my notes together after
studying each ILO in each case."
OVT: OVT6.1
Pilot site: OUNL
Pilot language: Dutch
Operational Validation Topic: Students find the feedback provided encourages them to undertake further study
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Students | 18. Using CONSPECT increases my curiosity about the learning topic. | Experimental | 3.0 | 0.82 | 25% | 4
Students | 19. CONSPECT makes learning more interesting. | Experimental | 2.3 | 0.50 | 0% | 4
Students | 20. Using the CONSPECT motivates me to explore the learning topic more fully. | Experimental | 2.5 | 1.0 | 25% | 4
Formative results with respect to validation indicator
Stakeholder type
Results
Students
“I like the graph visualization to have a nice overview of what I have missed”
One student liked the idea that some concepts that are not in the text are displayed in the conceptogram, as a way of
“discovering new associations between concepts”.
OVT: OVT6.2
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Students are confident that they are aware of those aspects of their conceptual coverage that are strong
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Students | Q36 I am confident that I am aware of those aspects of my conceptual coverage that are strong | Experimental | 3.6 | 0.96 | 56% | 16
Formative results with respect to validation indicator
Stakeholder type
Results
Students
“within each case there were many different ideas we had to go away and study, and using a system like Conspect it helped
draw all those different ideas back into the case story. This made the links and reasons and dynamics between each topic
more stated.”
OVT: OVT6.3
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Students who use CONSPECT feel better informed about their own learning.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Students | Q36 When I use CONSPECT I feel better informed about my own learning. | Experimental | 3.2 | 0.98 | 43% | 16
Formative results with respect to validation indicator
Stakeholder type
Results
Students
“CONSPECT allows easy comparison of the ideas that me and my peers or a tutor have about the case that we summarised.”
OVT: OVT7.1
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: There is a saving in institutional resources overall
Formative results with respect to validation indicator
Stakeholder type
Results
No evidence – The pilot presented an additional activity for the institution.
OVT: OVT7.1
Pilot site: OUNL
Pilot language: Dutch
Operational Validation Topic: There is a saving in institutional resources overall
Formative results with respect to validation indicator
Stakeholder type
Results
No evidence – The pilot presented an additional activity for the institution.
OVT: OVT8.1
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: The service addresses one or more institutional objectives
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“Student feedback is an issue for the School, specifically formative provision, so this service provides a solution to an
institutional problem. The challenge to adoption is that we don't ask students to provide the kinds of information this system
needs to work and there are the privacy issues around their data to consider so I don't think we would be able to make use of
it as it is now.”
OVT: OVT8.1
Pilot site: OUNL
Pilot language: Dutch
Operational Validation Topic: The service addresses one or more institutional objectives
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Tutors | C. I think CONSPECT addresses one of the burning problems of the institution. | Experimental | 3 | 1.58 | 40% | 5
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“Overall, I would say that CONSPECT has potential but is not yet developed far enough to make a significant contribution”
“Anything that helps time wise or money wise, we will grab by both hands and we will use it.”
Managers
A Manager suggested that the tool will address a key issue of the institution: the success rates of students that end their
studies. Studies show that lack of feedback is a key issue. With this tool you can provide feedback, repeatedly. “It is much less
frustrated for students to get a lot of feedback and when they have to handing their final essay to have some confidence that it
works. And that would keep students working, and don't get discouraged”
“It becomes very expensive if you want to provide feedback on writing essays. Certainly, you cannot do it for a reasonable cost
more than once, really you would like to do this repeatedly, but we are unable to do it, because it just takes too much
time....but it would help on the early stages to have automatic feedback on the things you are writing”
OVT: OVT9.1
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Users were motivated to continue to use the system after the end of the formal validation activities
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Tutors | 21. I would recommend this system to others. | Experimental | 1.8 | 1.30 | 20% | 5
Tutors | 22. I am eager to explore different things with CONSPECT | Experimental | 2.4 | 1.52 | 40% | 5
Tutors | 29. I would like to use the service after the pilot. | Experimental | 2.0 | 1.22 | 20% | 5
Tutors | 30. If the service is available after the pilot, I will definitely use it. | Experimental | 1.8 | 1.30 | 20% | 5
Students | 21 I would recommend this system to other learners to help them in their learning. | Experimental | 2.8 | 0.91 | 19% | 16
Students | 22 I am eager to explore different things with CONSPECT | Experimental | 3.3 | 0.95 | 50% | 16
Students | 29 I would like to use the service in my learning activities after the pilot. | Experimental | 3.5 | 0.89 | 50% | 16
Students | 30 If the service is available after the pilot, I will definitely use it in my learning activities | Experimental | 3.1 | 0.77 | 31% | 16
OVT: OVT9.2
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: A high score was obtained in the generic questionnaires (based on UTAUT: likelihood of adoption by users)
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Generic questionnaire - learners
Results:
Descriptive Statistics - Learners
Scale | N | Mean | Std. Deviation
Effectiveness | 22 | 3.13 | 0.805
Efficiency | 16 | 2.89 | 0.516
Cognitive load | 16 | 3.00 | 1.033
Usability | 22 | 2.93 | 0.867
Satisfaction | 22 | 3.08 | 0.752
Facilitating conditions | 22 | 3.83 | 0.795
Self-Efficacy | 16 | 3.40 | 0.772
Behavioural intention | 22 | 3.16 | 0.878
UNIMAN | 22 | 3.13 | 0.625
Valid N (listwise) | 16 | |
OVT: OVT9.2
Pilot site: OUNL
Pilot language: Dutch
Operational Validation Topic: A high score was obtained in the generic questionnaires (based on UTAUT: likelihood of adoption by users)
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology:
Results:
Descriptive Statistics - Learners
Scale | N | Mean | Std. Deviation
Effectiveness | 4 | 2.83 | N/A
Efficiency | 4 | 3.06 | N/A
Cognitive load | 4 | 2.25 | N/A
Usability | 4 | 3.50 | N/A
Satisfaction | 4 | 2.71 | N/A
Facilitating conditions | 4 | 2.92 | N/A
Self-Efficacy | 4 | 3.08 | N/A
Behavioural intention | 4 | 3.00 | N/A
OUNL | 4 | 2.98 | N/A
Valid N (listwise) | 4 | |
OVT: OVT9.3
Pilot site: OUNL
Pilot language: Dutch
Operational Validation Topic: Tutors attending a dissemination workshop give high scores to the question 'how likely are you to consider adopting the service in your own educational practice?'
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Tutors | How likely are you to adopt CONSPECT in your own educational practice? | Dissemination | 2.8 | 1.36 | 38% |
Section 4: Results – validation activities informing future changes / enhancements to the system
VALIDATION
ACTIVITY
Pilot partner: UNIMAN
Service language: English
Additional formative results (not associated with validation topics)
Alpha testing
Major changes identified during alpha testing but not yet implemented are:
Better handling of the creation of a group model is highly desirable. It is time-consuming to compare individual learner
progress with the group as a whole – more use of summary data is required.
The process of comparing conceptograms is not clear.
The visualisation of conceptogram comparison is not clear – there is no clear key / legend identifying the meaning of
the symbols.
There are two manually configured thresholds but no indication of the number range, whether they should be integer
or decimal, and implications of these for the processing and outputs.
When comparing 2 conceptograms, it is not immediately clear which is represented by which colour.
There are too many stem words in the results.
The terminology on the interface is not clear. The first screen the user sees has the options “feeds” and
“concepts” at the bottom, and their meaning is not clear. Something like “creation of conceptograms” (instead of feeds) and “list of
conceptograms” (instead of concepts) might give a better idea. It is also not clear what the lists show; there is no
description of what the interface is displaying.
Beta testing
Major changes identified during beta testing but not yet implemented are:
The ability to upload any kind of file, not just RSS feeds. There are privacy issues: for example, student materials
should not be freely available in a blog.
It is not clear why and how the tool uses “tags”.
There is no simple way of comparing progress of a learner over time. A time slider was discussed at previous
development meetings as a means of providing this feature but this has not yet been implemented.
Tutor interviews
Major changes identified by tutors are:
It would be better to upload any kind of file, not only feeds
Clicking on a concept should point directly to the place where the concept was found
(perhaps by highlighting all the occurrences of the concept). At present, it goes to the whole text, without any indication.
Conceptograms need to be easier to read and render in a way that stops them from moving about on the screen.
Terminology used must be made more intuitive
Results don't provide guidance or direction for action
There are options in the interface that are useless for tutors or students, such as “GraphML Data”
Tutor workshop(s)
Major changes identified by tutors attending the workshop are that the results are at too high a level and are not
sufficiently detailed to provide a useful basis on which to give feedback. The threshold management and
visualisation need to be improved to address these requirements, perhaps with a slider control as used in similar systems
(Leximancer for example).
Learner focus group
Major changes identified during the focus group are:
Terms are unclear (stemmed terms) – the service needs to present complete words and make use of phrases instead of
presenting two words of the same concept as individual bubbles.
The relationships between terms are not detailed enough to be useful – too high a level.
Having access to the tutor blog is desirable for students.
The group conceptogram is difficult to use – the terms are difficult to read and it is impossible to see an individual's
connections. A simple way to do this might be to use colour to highlight an individual's connections when their name
(square?) is clicked on.
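The request to present complete words rather than stems could be met in several ways, and the report does not say how CONSPECT would do it. As a minimal sketch, one option is to record which surface forms produced each stem and display the most frequent one. The `toy_stem` function below is a hypothetical, deliberately crude stand-in for whatever stemmer the service uses, and `display_forms` is an illustrative name rather than part of the service.

```python
from collections import Counter, defaultdict

def toy_stem(word):
    # Hypothetical, deliberately crude stemmer: strips a few English suffixes.
    for suffix in ("ations", "ation", "ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def display_forms(tokens, stem=toy_stem):
    """Map each stem to the most frequent complete word that produced it."""
    forms = defaultdict(Counter)
    for token in tokens:
        forms[stem(token)][token] += 1
    return {s: counts.most_common(1)[0][0] for s, counts in forms.items()}

# Hypothetical tokens from a learner text.
tokens = ["infection", "infections", "infections", "inflammation", "inflammation"]
labels = display_forms(tokens)
print(labels)  # stems such as "inflamm" are shown as a complete word instead
```

This keeps the stem as the internal key, so any processing that relies on stems is unaffected; only the label shown in the conceptogram or concept list changes.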
Teaching manager
interview
Privacy and quality assurance issues need to be managed within the service to make it easily implementable in an
Undergraduate Medical School context.
VALIDATION
ACTIVITY
Pilot partner: OUNL
Service language: Dutch
Additional formative results (not associated with validation topics)
Alpha testing
Major changes identified during alpha testing but not yet implemented are:
Better handling of LSA parameters is required; setting them is too complex and too time consuming. Furthermore, once the parameters
have been set and the graph generated, there is no way to go directly back to modify the parameters.
When building a conceptogram that includes more than one blog post, the concepts in the concept list are ordered by
post, and not by their relevance across all the selected posts.
Better handling of the creation of a group model is highly desirable; it is very time-consuming to compare
individual learner progress with the group as a whole – aggregated data is required.
The comparison of more than 2 conceptograms is difficult to interpret. It shows rectangles but it is not clear what they
mean.
It is possible to exclude some concepts in the graph, but it would also be desirable to be able to group concepts that
mean the same to the user.
When comparing 2 conceptograms, the table that shows the missing and overlapping concepts should have meaningful headings;
the current headings are not informative (“graph 1”).
The conceptograms present too many stem words, which could refer to different concepts in Dutch.
Beta testing
Major changes identified during beta testing but not yet implemented are:
The conceptogram does not show enough detail when the corpus is not specific to the
topic. It could provide a general view, but tutors request a deeper level of detail: “some terminology, that I really would
have expected to turn up in the concept maps, did not turn up, even though these are terms that are repeated quite
often through text inputs”.
The ability to upload any kind of file, not only feeds. There are privacy issues: for example, student materials should not be
freely available in a blog.
The dates and times of the creation of “conceptograms” are incorrect, and this creates confusion.
It is not clear why and how the tool uses “tags”.
Tutor interviews
Major changes identified by tutors are:
The user interface should be designed for end users, not from a technical perspective: “we use terms the teachers use, you use terms
from your technology and our teachers don't, they know what a concept is, but it is quite different from yours. Tag,
they don't know what a tag is”
Better handling of the parameters: “that you have to set the parameters, but you do not comprehend the meaning of
those parameters. So that should be turned around and presented in a tutor perspective, not in a technological
perspective”; “now it is too complicated and the things you said about just, well the user interface problems and
setting parameters that you don't understand. It happens to take a lot of steps”
It would be better to upload any kind of file, not only feeds
Clicking on a concept should point directly to the place where the concept was found
(perhaps by highlighting all the occurrences of the concept in the original text). At the moment, it goes to the whole
text, without any indication.
Two different modes, tutor and student. In tutor mode, tutors should be able to indicate the most
important concepts in advance.
Display both the conceptogram and the list of concepts on one screen, side by side.
Percentages of completion (% of concepts that are covered by one text vs. the other text)
Easy integration with the University's current VLE (Blackboard or Moodle)
There are options in the interface that are useless for tutors or students, such as “GraphML Data”
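The requests for a clearly labelled table of missing and overlapping concepts and for a percentage of completion can be illustrated with plain set operations. This is a sketch under stated assumptions, not CONSPECT's implementation: concepts are treated as exact strings, the function name `compare_concepts` and the sample concept lists are hypothetical, and coverage is taken to mean the share of reference concepts that also appear in the learner's text.

```python
def compare_concepts(learner, reference):
    """Compare a learner's concept list with a reference (e.g. tutor) list."""
    learner, reference = set(learner), set(reference)
    overlap = learner & reference   # concepts found in both texts
    missing = reference - learner   # reference concepts the learner did not cover
    extra = learner - reference     # learner concepts absent from the reference
    coverage = 100.0 * len(overlap) / len(reference) if reference else 0.0
    return {
        "overlap": sorted(overlap),
        "missing": sorted(missing),
        "extra": sorted(extra),
        "coverage_pct": round(coverage, 1),
    }

# Hypothetical concept lists, not taken from the pilot data.
result = compare_concepts(
    learner=["asthma", "inhaler", "wheeze"],
    reference=["asthma", "wheeze", "steroid", "allergen"],
)
print(result)
```

Returning named fields rather than positional columns addresses the complaint about uninformative headings such as “graph 1”: each part of the comparison carries its own label.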
Tutor workshop(s)
Major changes identified by tutors attending the workshop are:
Indicate what the relation between concepts is (semantic meaning of the links)
Highlight the concepts in the original text
Zoom in the conceptogram: view the underlying concepts, links and additional texts
Importance of each one of the concepts in the whole graph (%)
Comparison of links between conceptograms.
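Highlighting a concept in the original text, which tutors asked for in both the interviews and the workshop, could be prototyped as follows. This is an illustrative sketch only: the `highlight` function, the `[[ ]]` markers and the prefix-matching rule (a concept also matches words that start with it, as a rough proxy for stemmed terms) are assumptions, not features of the service.

```python
import re

def highlight(text, concept, open_mark="[[", close_mark="]]"):
    """Wrap every word that starts with `concept` in visible markers."""
    # \b anchors the match at a word start; \w* extends it to the end of the word.
    pattern = re.compile(r"\b" + re.escape(concept) + r"\w*", re.IGNORECASE)
    return pattern.sub(lambda m: open_mark + m.group(0) + close_mark, text)

# Hypothetical sentence, not taken from a pilot blog post.
print(highlight("Infection risk rises; infections spread.", "infection"))
```

In a web interface the markers would presumably be replaced by markup that styles the matched words, and the first match could be used as the scroll target when a concept is clicked.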
Learner focus group 1
Major changes identified during learner Focus Group are:
Zoom in concepts and get more information about the concept, its relations with other concepts, etc.
Highlight the concepts in the original text
Show the semantics of the links
Comparison of links (my relations vs. tutor's relations in a conceptogram)
Display the conceptogram side by side with the list of words
Be able to manipulate the graph: make groups of words, annotate the map
Compare only parts of the text, or only certain concepts from the reference model
Identify why a concept is displayed in the graph (whether it comes from the text or from the corpus only)
In the concept list: show major concepts, their value (%), tree structure (that shows the relation between the concept
and other concepts)
Teaching manager
interview
Integration with the institutional VLE (Moodle)
The system has to have a very easy, intuitive and friendly interface
The output the system provides must be thoroughly tested, so that its quality is demonstrated
Section 5: Results – validation activities informing transferability, exploitation and barriers to adoption
VALIDATION
ACTIVITY
Partner(s) involved: UNIMAN
Service language: English
Additional formative results (not associated with validation topics)
Alpha testing
Stemmed words – this issue must be addressed to enable tutors and students to see value in the service
Naming and time conventions must be improved
Service must permit different formats of document to be included
Beta testing
Tutors are concerned about the use of blogs as input; there are fundamental concerns about how privacy would be
managed in the system.
Tutors required significant support to use the system and found it overly complex to access the information they
required.
Tutors found the service was most useful to identify outliers but considered that the conceptogram output was not a
sufficient basis on which to provide feedback; the terms were not relevant.
Tutors found the OpenID login process unwieldy.
Tutor interviews
Tutor workshop(s)
All tutors mentioned that:
o The tool is too complex for most people; the navigation and results presentation need to be simple to use.
o The setting of parameters is too complex and, with different blogs produced using different parameters, the
comparability of the outputs is brought into question
Tutors suggested alternative uses of the service, e.g.:
o Assess Masters level assignments on the MSc teaching certification course
o Check if the learning materials tutors are creating contain the most important concepts
o Assess different resources for their relevance to the learning objectives
Tutors were positive about using blogs with PBL:
“One point I would like to make was that while I was working on the blog I found it an extremely useful way of looking that
bit deeper into the case (even though i had recently covered this one it is amazing how quickly we move on to the next
case), probably not a bad thing for a PBL tutor. If these blogs were to be produced for all cases in the future I think it
would be extremely beneficial for them to be made available to tutors as well as students being able to compare their own
blog. A group of tutors producing blogs may also be a good idea?”
Learner focus group 1
Students would be more prepared to use this if it was simpler to log in – using their institutional details with a single
log in for the blog and the service was identified as desirable.
Teaching manager
interview
Teaching managers see the value of CONSPECT as a catalyst to encourage tutors to provide formative feedback, if
the necessary processes and technologies were put in place. As learners responded positively to producing a blog for
this activity, the results will be used to inform future decisions about inclusion of such activities and specifications for
use of e-learning technologies in the new curriculum.
Workshop with
Faculty e-learning
experts
In answer to the question "How likely are you to adopt CONSPECT in your own educational practice?": mean 4.3 out
of 5, SD 0.75, 83% Agree/Strongly Agree (n=18)
VALIDATION
ACTIVITY
Partner(s) involved: OUNL
Service language: Dutch
Additional formative results (not associated with validation topics)
Alpha testing
Major issues encountered in transferring CONSPECT to Dutch:
o Stemming: there are too many stem words, which could refer to different concepts
o Stop list: it should contain an adequate level of detail, otherwise the tool displays too many irrelevant words
Major issues encountered in transferring CONSPECT to the Evolutionary Psychology domain:
o Specific corpus: a specific corpus about Evolutionary Psychology is not available in Dutch; most books and
references are in English, but students learn and write in Dutch.
Beta testing
There was considerable resistance from tutors to the use of blogs as input, as they do not have time to maintain blogs as
well as do their coursework. They are very concerned about privacy.
Tutors were sceptical about trying the tool; it seemed too complex
Tutors have problems understanding what the graph is telling them.
An alternative way of signing in, besides OpenID
Tutor interviews
All tutors mentioned that:
The tool is too complex; the interface should be simpler and easy to use.
The setting of parameters is too complex
They would need to have the tool integrated with the other technologies they use (Moodle, Blackboard) and it has to be
completely bulletproof.
Upload of any kind of file (Word, PDF) should be possible; blogs are not widely used and have privacy issues.
Tutors pointed out several possible new uses of the service, such as:
Check bachelor theses (comparing the traditional way of assessing them with the results from CONSPECT)
Check if the learning materials tutors are creating contain the most important concepts
Generate a first outline (based on a set of input resources) to create study materials
Check for plagiarism
Check different study materials (books, references, etc.) and compare them, to decide which of them best fits the
learning objectives
Quality check of the learners' text materials: learners might be obliged to produce a conceptogram to show they are
producing learning evidence before being able to get access to the model answers (currently, students get model
answers automatically: if they type a possible answer they get the model, but most of them type only letters,
without really writing any answer)
Check internal coherence of texts, discourse, argumentation
Use in forums, to get a picture of the discussion by generating a conceptogram of a group model
Tutor workshop(s)
Tutors recognise the value of CONSPECT, but feel the tool is still too basic in terms of the semantic meaning of the links,
and see a need for interactive feedback on the links and relationships between concepts.
Learner focus group 1
Most students mentioned that the interface was not intuitive. One student mentioned that he liked the idea that the
interface did not have too many options
Learners mentioned that they will not use the system if it is too time consuming
Learners raised concerns about the ethical use of the tool: that the tool could be used by tutors to mark
students' work
Teaching manager
interview
The manager indicated that the tool should be fully validated in different topics and courses.
The interface should be intuitive and integrated in the current VLE (Moodle)
Transferability questionnaire: institutional policies and practices
UNIMAN is using the Blackboard platform; this is a hosted service and there is no opportunity to integrate CONSPECT. OUNL uses
Blackboard and Moodle, and there is a strong view that any new service should be integrated into the University's VLE.
UNIMAN has strict privacy and ethics policies, which govern the way in which student data is anonymised and shared with both other
students and staff. The way the service handles student data needs to be amended to work with the existing privacy rules governing data
held on the VLE. Permissions could be inherited from existing data, or need to be determined manually, and the data needs to be
anonymised automatically for the service to be made available across the institution.
At OUNL, tutors and students want to keep their work private; they are not willing to share their texts with others on the Web. The tool uses
open feeds, which raises privacy issues for students and tutors. Students do not want to have their texts published, and tutors do not want to
have their materials freely available through a blog.
Up to now, OUNL students have not received any automatic feedback from a computer system. Tutors are afraid that students will not be
willing to interact with a feedback system instead of with a tutor.
UNIMAN staff and students do not use blogs. Staff and students undertook the writing of blogs as an additional activity to their normal
workload. Participants found the process of writing blogs about their practice extremely useful, and the pilot provides positive evidence of
the usefulness of this activity which could be used to inform institutional practice in the future.
OUNL tutors do not use blogs in their teaching practice. Students are not asked to use blogs during their studies. Tutors are not
willing to incorporate any means outside the university's VLE. They want to be able to upload any kind of file, not only feeds.
OUNL students are very pragmatic: they focus on getting the information and learning materials they need to study and do not have time for
any extra effort. The tool is still too complex by itself, particularly the parameter section. For a student, a single click to get the
feedback seems the most desirable way to gain adoption.
OUNL students are mostly adults and are not digital natives (sometimes they have problems logging in to the normal VLE site). The tool
requires substantial digital knowledge, for example understanding what an OpenID, a blog or a feed is.
OUNL tutors are willing to incorporate new tools only if they are designed in their own “vocabulary and terms”. The tool is still quite
technically oriented, and contains quite a number of terms, parameters and information that tutors do not understand. For instance, the
parameter section is a particular problem, and the “GraphML Data” option is not useful for tutors.
Transferability questionnaire: Relevance of the service in other pedagogic settings
Pedagogic setting
Reason(s)
Pedagogic settings for which the service would be suitable:
Service initially developed for Problem Based Learning
Any setting in which learners produce text materials based on
their knowledge of a given domain.
reasons: students can use the service to compare their written
materials to materials representing a desired level of domain knowledge
in any area for which the tool has been primed with an appropriate LSA
space.
Pedagogic settings for which the service would be less
suitable:
Any setting in which learners work principally with non-text
based materials, and where they are not required to write about
their learning of such.
reasons: Text-based learning materials are the input for the service.
Transferability questionnaire: Relevance of the service in other domains
Types of domain
Reason(s)
Types of domain for which the service would be suitable:
setting 1: Any domains where the primary discourse for
assessment is text-based. (no images, no formulas, no
procedural knowledge): literature, psychology, education, social
and human sciences.
reasons: LSA analyses words and relations in language, to establish
the closeness of concepts.
Types of domain for which the service would be less suitable:
setting 1: Domains where knowledge is practical, procedural.
E.g. engineering, mechanics
setting 2: Domains where knowledge is subjective and locally
contextualised, where no corpus in the desired language is
available.
setting 3: Domains such as mathematics or chemistry where
much of the information is symbolic.
reasons: CONSPECT could identify that learners knew what terms
meant, e.g. from a glossary, but would not be best placed to assess
their knowledge of assembly / execution of tasks.
Section 6: Conclusions
Validation Topics
OVT
Operational Validation Topic
Validated
unconditionally
Validated with
qualifications*
Not
validated
Qualifications to validation
PVT1: Verification of accuracy of NLP tools
OVT1.1
Tutors have assessed that a high percentage of
the concepts identified by CONSPECT are
relevant to the task the learners have
undertaken.
UNIMAN
Technical terms are stemmed and
need to be represented as
complete words to improve
accuracy. This would improve the
clarity of the results.
The detail shown in the
conceptogram is not deep enough
to represent the text. However,
the graph does include relevant
concepts.
OVT1.2
Tutors have calculated that a high proportion of
concepts have been correctly categorised as
important.
UNIMAN
OUNL
Level and detail of reporting
needs further refinement
(UNIMAN)
The stemming algorithm is not of
sufficient quality for the Dutch
language, the same stem could
refer to different concepts, the
same concept could have several
stems. This makes it difficult to
understand the graph and list of
concepts.
OVT1.3
Tutors have assessed that CONSPECT has
provided appropriate linkages between concepts
for most of the relevant conceptual relations.
UNIMAN
Level and detail of reporting
needs further refinement
OVT1.4
Tutors have assessed that most of the concepts
have been correctly categorised as important.
UNIMAN
OUNL
Level and detail of reporting
needs further refinement
PVT2: Tutor efficiency
OVT2.1
Using CONSPECT, tutors spend less time
preparing feedback than without the system
UNIMAN
OUNL
It takes more time to provide
feedback.
OVT2.2
It is easier (there is less cognitive load) for tutors
to provide feedback using CONSPECT
compared with not using the system
UNIMAN
OUNL
There is a higher cognitive load.
PVT3: Quality and consistency of (semi-)
automatic feedback OR information returned
by the system
OVT3.1
Tutors judge that CONSPECT shows correctly
the conceptual coverage of a given topic.
OUNL
UNIMAN
The complexity of the topic used
in the validation was too high for
CONSPECT to draw useful
results from. Tutors indicated that
conceptograms show important
concepts, but others that are also
relevant are not shown in the
graph.
OVT3.2
Tutors agree with the learner progress shown by
CONSPECT
OUNL
UNIMAN
Tutors perceive that the
information is potentially useful,
but the ambiguity of stemmed
words was problematic.
OVT3.3
Students feel the feedback provided supports
them in adapting their learning plans.
OUNL
UNIMAN
OVT3.4
Students agree with the feedback provided
UNIMAN
Students found the process useful
but the presentation of results was
unclear.
Students indicated that they found
the outputs of CONSPECT
relevant, but that they would like
to have more information about
the concept itself and the meaning
of its relations. (OUNL)
PVT4: Making the educational process
transparent
OVT4.1
Students are able to position themselves
whenever they want using CONSPECT
OVT4.2
Using CONSPECT, tutors are able to assess the
conceptual progress of their students based on
their reflective documents.
UNIMAN
OUNL
OVT4.3
Tutors are able to locate the outliers
within their groups
UNIMAN
OVT4.4
Tutors are able to provide extra support for the
problematic outliers during the learning process
UNIMAN
OUNL
UNIMAN
PVT5: Quality of educational output
Tutors were able in some cases to
identify outliers. No relevant
evidence from OUNL.
OVT5.1
Students are able to use the feedback given
during the writing process to help them to
improve the final texts
UNIMAN
Students felt they were able to
identify some concepts on which
they were then able to improve
their texts (UNIMAN)
PVT6: Motivation for learning
OVT6.1
Students find the feedback provided encourages
them to undertake further study
UNIMAN
OVT6.2
Students are confident that they are aware of
those aspects of their conceptual coverage that
are strong
UNIMAN
Seeing that others had covered
similar topics helped to positively
reinforce learners' confidence
OVT6.3
Students who use CONSPECT feel better
informed about their own learning.
UNIMAN
Seeing that others had covered
similar topics helped to positively
reinforce learners' confidence
OUNL
Students find the feedback useful,
but there are no clear identifiers
that the feedback provided
encourages them to undertake
further study. This has to be
validated further.
PVT7: Organisational efficiency
OVT7.1
There is a saving in institutional resources overall
N/A: There is no evidence to
support that there is a saving in
institutional resource. Students
are not presently provided with
formative feedback on written
materials of the format used within
the pilot.
PVT8: Relevance
OVT8.1
The service meets one or more institutional
objectives
UNIMAN
OUNL
Tutors could identify new potential
uses of the tool, identify the
benefits of automatically
processing information, and see
how the tool could be used in
future plans of the university
(UNIMAN).
A manager indicated that the tool
helps to provide continuous
formative feedback, which might help to
motivate students, and therefore
reduce the drop-out rate (one of the
institutional objectives) (OUNL).
PVT9: Likelihood of adoption
OVT9.1
Users were motivated to continue to use the
system after the end of the formal validation
activities
UNIMAN
Tutors were enthusiastic about
the potential of CONSPECT but
not its current implementation.
OVT9.2
A high score was obtained in the generic
questionnaires (based on UTAUT: likelihood of
adoption by users).
UNIMAN
OUNL
UNIMAN: Mean 3.13
OUNL: Mean 2.98
OVT9.3
Tutors attending a dissemination workshop give
high scores to the question 'how likely are you to
consider adopting the service in your own
educational practice?'
OUNL
OUNL: Mean 2.8
Exploitation (SWOT Analysis)
The objective you are asked to consider is: "CONSPECT (v1.5) will be adopted in pedagogic contexts beyond the end of the project".
Strengths
The strengths of the system (v1.5) that would be positive indicators for adoption are:
The system provides formative feedback on demand to learners. This is a major institutional objective in both pilot
institutions.
The system provides a novel means by which to analyse a text document, and establish the presence and coverage
of some of the key concepts that relate to a background corpus. To provide this information manually is time consuming
and resource intensive, particularly in learning environments with large numbers of learners who are required to
produce text and interactions (blogs, forums).
The conceptual basis of the system was positively received by all stakeholders.
Weaknesses
The weaknesses of the system (v1.5) that would be negative indicators for adoption are:
Care should be taken that the tool produces output which has an adequate level of detail (i.e. concepts are specific
to the topic, not general concepts).
At present, the system only takes RSS feeds as input. Interoperability with other components of VLE is needed to
market it to potential vendors.
The interface is too complex and not user-oriented. Tutors expect that students will not have enough knowledge and
skills to use the tool.
The process and interface support to access and compare user outputs needs further improvement.
Opportunities
The system has potential as follows:
The system has the ability to assist tutors in identifying outliers quickly. Tutors were keen on this aspect of the service.
The system can be used to extract the main points from reading materials quickly, so that tutors/learners can see
whether a long text is relevant to their practice or current study plans.
The system can be used to check whether or not students' inputs are sufficient to get a 'model answer' from the
institutional VLS.
The system could be used to extract the main points from a discussion forum, so tutors/learners can see the topics of
discussion and have a view of what the group is discussing.
Threats
Both institutions reported that the software would be unlikely to be adopted because it is not part of the institutional
VLE. The institutions are standardising on major corporate software in order to reduce maintenance costs.
Trust: Tutors and learners don't have confidence in the results of CONSPECT.
Overall conclusion regarding the likelihood of adoption of CONSPECT Version 1.5:
CONSPECT shows potential for adoption in other institutions to meet major institutional objectives of feedback on demand and personalised
support for learners. The ability to check whether students are able to produce a piece of text from the course and identify the percentage of
students who are not considering key terms is attractive. However, the service requires further enhancement to realise its potential.
Most important actions to promote adoption of CONSPECT:
Solve interface issues. The interface should be user-centered. Minimize learner/tutor input (particularly, LSA parameters).
Provide guidance to interpret the conceptograms, to assist users' understanding of them
Recognize the importance and impact of the corpus (detail, language) and thresholds required to produce useful outputs
Test integration with VLE (e.g., Moodle, Blackboard, elgg)
Provide aggregated statistical reports on learner progress
Section 7 – Road map
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important future
enhancements to the system in order to meet stakeholder requirements:
Most important:
1. Present complete words and meaningful phrases to be of use to students and tutors.
2. Handle multiple input formats
3. Provide a simple time-based view of an individual's conceptual evolution
4. Improve the visualisation of comparisons of conceptogram models to make it simpler to identify specific individuals.
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important changes to
the current scenario(s) of use in order to meet stakeholder requirements:
Most important:
1. CONSPECT will be of greater value in lower-level courses where the knowledge requirements are based on fewer topics with greater focus.
2. Users should maintain a blog over the duration of a module, not simply over the duration of service use, to provide a richer dataset for analysis
by the service.
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are possible additional educational
contexts for future deployment:
Most important:
1. The system could be used to extract the main points from a discussion forum, so tutors/learners can see the topics of discussion and have an
overview of what the group has been discussing
2. The system could be used to support collaborative writing: students first write their own text, and then create conceptograms to compare with
others, as well as a group model. This is input for discussion and for developing a new version of the text
3. The system could be used to analyze data collected in research (focus groups, interview text), and extract key concepts mentioned in these
data
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important issues for
future technical research to enable deployment of language technologies in educational contexts:
Most important:
1. Understanding how language relating to process and procedural knowledge can be analysed meaningfully.
2. Understanding how language analysis of discourse about entities and their attributes differs from language about processes and procedural
knowledge
3. Identifying how student discourse about their skills and competence can best be captured in formats that lend themselves to language pattern
analysis.
4. Understanding how these technologies are applied to the learner in his/her knowledge domain, and how the outputs can best be formulated
to allow him/her to plan his/her learning trajectory.
Roadmap - validation activities
Further validation planned for beyond the end of the project:
Claim (OVT): Tutors have identified that a conceptogram displays mainly the most relevant concepts in the input text, and that stem words and irrelevant
words are not predominant in the graph
Methodology:
1. Comparison of conceptograms against input text
2. Tutor interview
Objective (OVT):
Students find the feedback provided helps them to understand better the topic and encourages them to undertake further study
Methodology:
1. Thematic analysis of students' behaviour within the VLE or system (experimental and control group)
2. Test examination (marks experimental and control group)
3. Student questionnaire
Objective (OVT):
Tutors feel that the system reduces their workload and that they can focus on problematic outliers during the learning process
Methodology:
1. Tutor questionnaire
2. Student questionnaire
Appendix B.4 Validation Reporting Template for WP5.1 (PUB-NCIT and UNIMAN)
Section 1: Functionality implemented in Version 1.5 and alpha / beta-testing record
Brief description of functionality
Version
number of unit
Changes from Version 1.0
Tutor view: Assignment Maintenance (add /
define, edit, delete)
V1.5
Added forum assignments functionality – did not previously exist.
Added import LSA space from the R-LSA framework – did not previously exist.
Thus new spaces were created for the Medicine pilot and the Long Thread
validation on Web2.0. Other new spaces are easy to import and use.
Tutor view: Conversation Maintenance (add /
upload, edit / define, delete, process /
analyze)
V1.5
Added possibility to process CSV files that contain transcripts of discussion
forums.
Student and tutor view: Conversation
Feedback
V1.5
Improved functionality by adding suggested concepts from the semantic space
that are missing from the conversation in order to give the learners a
suggestion on what to study next.
Implemented clustering of concepts for providing more accurate concept
groups, in order to display to the users only the most relevant concepts from
each cluster (for missing concepts especially).
Improved the appearance (changed labels, added headings) of the feedback
by taking into consideration the results from v1.0 validation round.
Student and tutor view: Conversation
Visualization
V1.5
Adapted the visualization for being able to display discussion forum threads
(lower number of posts than for chats and greater number of participants).
Student and tutor view: Utterance Feedback
V1.5
Added new functionality for labelling the posts in a discussion forum using
Garrison and Anderson's “Model of Inquiry” to separate posts that contain
teaching, social and cognitive presence.
The analysis of v1.0 showed that the grading of the utterances was problematic
because the content score was not high enough and therefore less
important utterances received greater scores. The proposed solution was to
modify the grading algorithm by taking into account the scores assigned by
tutors to utterances in the v1.0 validation and adjusting the factor of each scoring
component (content score, social score, utterance structure score, etc.). The
new results show an improvement.
Changed the linguistic patterns for several speech acts in order to increase the
accuracy of speech act identification when using the gold standard
corpora.
Student and tutor view: Participant Feedback
V1.5
The analysis of v1.0 showed that the grading of the participants regarding the
on-topic (content) score was problematic due to the fact that some participants
may have used important concepts, but only from a given part of the semantic
space. Therefore, the proposed solution was to add a bonus factor for using
important concepts that are not in the same cluster in the semantic space.
Improved social score to also take into account the implicit links, but with a
lower importance than the explicit links.
Student and tutor view: Search Conversation
V1.5
Improved ranking of the search results by making use of the concept clustering
ability. For each keyword, the query is expanded only with the terms in
its cluster, thus reducing the search time by more than a factor of 10 (especially
important for large semantic spaces).
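The cluster-restricted query expansion described above can be illustrated with a minimal sketch. It is not PolyCAFe's implementation: the function names, the representation of clusters as sets of terms, and the example clusters are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def build_cluster_index(clusters: List[Set[str]]) -> Dict[str, Set[str]]:
    """Map every term to the cluster (set of terms) that contains it."""
    index: Dict[str, Set[str]] = {}
    for cluster in clusters:
        for term in cluster:
            index[term] = cluster
    return index


def expand_query(keywords: List[str], index: Dict[str, Set[str]]) -> List[str]:
    """Expand each keyword only with the terms of its own cluster."""
    expanded: List[str] = []
    seen: Set[str] = set()
    for keyword in keywords:
        # Keyword first, then its cluster neighbours in a stable (sorted) order.
        neighbours = sorted(index.get(keyword, set()) - {keyword})
        for term in [keyword] + neighbours:
            if term not in seen:
                seen.add(term)
                expanded.append(term)
    return expanded


# Hypothetical clusters; a real semantic space would supply these.
clusters = [{"chat", "message", "reply"}, {"wiki", "page", "edit"}, {"blog", "post"}]
index = build_cluster_index(clusters)
print(expand_query(["wiki", "chat"], index))
```

Because each keyword is expanded with one cluster only, the number of terms to match stays small even when the semantic space is large, which is the effect the change description attributes to clustering.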
Alpha-testing
Pilot site and language
PUB-NCIT, English
Date of completion of alpha testing:
October 15th, 2010
Who performed the alpha testing?
Traian Rebedea, Mihai Dascalu
Pilot site and language
UNIMAN
Date of completion of alpha testing:
7 October 2010
Who performed the alpha testing?
Alisdair Smithies, Isobel Braidman
Beta-testing
Pilot site and language: PUB-NCIT, English
Has stakeholder validation taken place using the service embedded in Elgg (Yes/No/Partially): No
If ‘No’ or ‘Partially’, give reasons: Preferred to run the pilot using the stand-alone version of PolyCAFe as this seemed more suitable (e.g. larger
screen space, no other information except the PolyCAFe widgets is presented) for the users. The Long
Thread is using the Elgg version of PolyCAFe's widgets, which have also been alpha and beta-tested by the
same people.
beta-testing performed by:
Iulia Pasov (student), Claudiu Mihail (student), Costin Chiru (tutor), Alexandru Gartner (tutor)
beta testing environment (stand-alone service / integrated into Elgg):
stand-alone service
HANDOVER DATE:
October 26th, 2010
(Date of handover of software v.1.5 for validation)
Pilot site and language: UNIMAN, English
Has stakeholder validation taken place using the service embedded in Elgg (Yes/No/Partially): No
If ‘No’ or ‘Partially’, give reasons: Preferred to run the pilot using the stand-alone version of PolyCAFe, hosted by PUB, as this is more
appropriate for the users. The Long Thread is using the Elgg version of PolyCAFe.
beta-testing performed by:
Zara Sandiford (student), Santosh Tadi (student), Maria Regan (tutor)
beta testing environment (stand-alone service / integrated into Elgg):
stand-alone service, hosted by PUB
HANDOVER DATE:
November 5th, 2010
(Date of handover of software v.1.5 for validation)
Section 2: Validation Pilot Overview
NB Information about pilot sites, courses and participants has been transferred to Appendix A.3
Pilot task
Pilot site: PUB-NCIT
Pilot language: English
What is the pilot task for learners and how do they interact with the system?
The learners are divided into groups of 5 students (5 experimental and 2 control groups) and are given two successive chat assignments related to
Human-Computer Interaction to debate using ConcertChat. The experimental group was asked to use PolyCAFe to get feedback for each
assignment. The control group did not use PolyCAFe for the first assignment. The use of PolyCAFe for the second assignment is not mandatory,
so the learners have an option to use PolyCAFe only if they think it would be useful for them. The two topics for the assignments are:
- A debate about the best collaboration tool for the web: chat, blog, wiki, forums and Google Wave. Each student shall choose one of the 5
tools and shall present its advantages and the disadvantages of the other tools. Thus, you will act as a "sales person" for your tool and try
to convince the others that you have the best offer (act as a marketer - http://www.thefreedictionary.com/marketer). You must also
defend your product whenever possible and criticize the other products if needed.
- You are in the board of decisions of a company that plans to use collaborative technologies for its activities. Each of you has studied the
advantages and disadvantages of the following technologies that are considered by the company: chat, blog, wiki, forums and Google
Wave. Engage into a collaborative discussion in order to decide for which activities it is indicated to use each technology. You
should give the best advice for the technology that you support and convince the others to use it. The result of this discussion
should be a plan of using these technologies in order to have the best outcomes for your company. You can also think of other useful
technologies beside these ones, but do not insist on them.
What do the learners produce as outputs? Are the outputs marked?
The learners produce as outputs two chat conversation logs for each group. These outputs do not count in the mark for the HCI course as not all
the students attending the course participated in the validation experiment. However, the chat conversations are marked by the tutors for the
verification experiment.
How long does the pilot task last, from the learners starting the task to their final involvement with the software?
The pilot task runs for about a month: November 11th – December 10th, 2010.
How do tutors/student facilitators interact with the learners and the system?
The tutors define the assignments in PolyCAFe and the list of relevant topics for each assignment.
The tutors provide manual feedback to each of the students involved in a chat conversation for the first assignment. Each tutor assesses one
conversation without using PolyCAFe and one conversation using PolyCAFe. No manual feedback is provided for the second assignment, only the
feedback provided by PolyCAFe.
Each tutor uses PolyCAFe to help him assess and provide final (manual) feedback to 2-3 chat conversations.
Describe any manual intervention of the LTfLL team in the pilot:
There are no manual interventions done in the pilot.
Pilot site: UNIMAN
Pilot language: English
What is the pilot task for learners and how do they interact with the system?
The learners are divided into groups of 7 or 8 students (4 experimental and 2 control groups) and are given a forum assignment related to
debating professional practice in Medicine. Then, they are asked to use PolyCAFe and get feedback (the two control groups did not use PolyCAFe
initially; they received the results of the PolyCAFe analysis at the end of the discussion).
What do the learners produce as outputs? Are the outputs marked?
The learners produce as outputs a discussion forum for each group. These outputs do not count in the mark for the course as not all the students
attending the course participated in the validation experiment. All students participate in the discussions but only a small sample of groups have
been used for the pilot. Formal feedback is not normally provided on the forums and they are not marked by a teacher/tutor; they are moderated
by a student facilitator, who is trained in online facilitation techniques.
The activity of assessing the forums using PolyCAFe and viewing the feedback it generates is therefore an additional task within the learning
environment.
How long does the pilot task last, from the learners starting the task to their final involvement with the software?
The pilot task runs for a month: November 15th – December 15th, 2010.
How do tutors/student facilitators interact with the learners and the system?
The facilitators guide a discussion about professionalism. All students participate in this activity, with student facilitators leading each group.
The facilitators participating in the pilot use the feedback provided by PolyCAFe to direct their guidance and can choose to share the outputs with
the students involved in a discussion, via email.
Describe any manual intervention of the LTfLL team in the pilot:
Compilation of the forums into spreadsheets, adapting outputs into a format that can be processed by PolyCAFe, distributing results of analysis via
email to student facilitators.
Experiments
Name of experiment: Experiment A – PUB-NCIT
Objective(s):
Determine the relative quality of the manual feedback provided by tutors with and without using PolyCAFe
Details:
Each chat conversation for the first assignment is provided with manual feedback from 4 tutors: 2 using PolyCAFe and 2 not using it.
After that, the tutors decide which manual feedback is better (i.e. the feedback informed with / without PolyCAFe) by using a set of common
indicators: quality of feedback related to participation and collaboration, quality of feedback related to the content of the conversation, coverage of
the feedback.
Name of experiment: Experiment B – PUB-NCIT
Objective(s):
Determine the quality of the grading of participants provided by PolyCAFe
Details:
Each tutor who did not use PolyCAFe when giving manual feedback on a particular chat conversation (see Experiment A: for each conversation,
2 tutors provided manual feedback without PolyCAFe) and each student provides a ranking, in order of merit, of the participants in the chat
conversation they attended, considering (1) content, (2) collaboration and participation, and (3) overall. The rankings produced by tutors and
learners were compared with the ones provided by PolyCAFe.
Thus, tutors (6) and students (35) manually ranked the participants in each of the 7 chat conversations for the first assignment. For each chat
conversation, there were 5 rankings from the students, plus two from the tutors who did not use PolyCAFe for providing manual feedback.
The average ranking for each participant in a conversation was then computed and compared to the one provided by PolyCAFe for content
and social impact.
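The comparison step in Experiment B can be sketched as follows. The report does not define how "precision" or "average distance" between rankings were calculated, so the two measures below (share of participants placed at exactly the same rank, and mean absolute difference in rank position) are assumptions, as are the function names, the alphabetical tie-breaking and the example data.

```python
from statistics import mean
from typing import Dict, List


def average_ranking(rankings: List[Dict[str, int]]) -> Dict[str, int]:
    """Average each participant's rank over all raters, then re-rank (1 = best)."""
    participants = list(rankings[0].keys())
    avg = {p: mean(r[p] for r in rankings) for p in participants}
    # Ties are broken alphabetically; this is an arbitrary choice for the sketch.
    ordered = sorted(participants, key=lambda p: (avg[p], p))
    return {p: position for position, p in enumerate(ordered, start=1)}


def exact_match_rate(a: Dict[str, int], b: Dict[str, int]) -> float:
    """Share of participants placed at the same rank in both rankings (assumed measure)."""
    return mean(1.0 if a[p] == b[p] else 0.0 for p in a)


def average_distance(a: Dict[str, int], b: Dict[str, int]) -> float:
    """Mean absolute difference in rank position (assumed measure)."""
    return mean(abs(a[p] - b[p]) for p in a)


# Hypothetical rankings of five participants by three raters.
student_rankings = [
    {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5},
    {"A": 2, "B": 1, "C": 3, "D": 5, "E": 4},
    {"A": 1, "B": 3, "C": 2, "D": 4, "E": 5},
]
system_ranking = {"A": 1, "B": 3, "C": 2, "D": 4, "E": 5}

manual = average_ranking(student_rankings)
print(manual)
print(exact_match_rate(manual, system_ranking))
print(average_distance(manual, system_ranking))
```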
Section 3: Results - validation/verification of Validation Topics
OVT:
OVT1.1
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The tutors/experts find that the speech acts discovered in the conversation (chat or forum) are correct.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Tutors (2) manually annotated two chat conversations with speech acts. Precision and recall were computed for
each speech act.
Results:
Average speech act class precision: 85%
Average speech act class recall: 70%
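As a rough illustration of how per-class precision and recall can be derived from a manual annotation, the sketch below compares gold labels with predicted labels and then averages over classes. The function name and the toy labels are assumptions. The two lists of reported values are the per-class percentages from this section, reading the last three rows (Conventional, Personal opinion, Sorry) as precision 66%, 100%, 66% and recall 50%, 36%, 75%; under that reading the unweighted class averages agree with the 85% and 70% stated above.

```python
from collections import Counter
from typing import Dict, List, Tuple


def per_class_precision_recall(
    gold: List[str], predicted: List[str]
) -> Dict[str, Tuple[float, float]]:
    """Return {label: (precision, recall)} for every label seen in either list."""
    true_positive: Counter = Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            true_positive[g] += 1
    gold_count = Counter(gold)
    predicted_count = Counter(predicted)
    result = {}
    for label in set(gold) | set(predicted):
        precision = true_positive[label] / predicted_count[label] if predicted_count[label] else 0.0
        recall = true_positive[label] / gold_count[label] if gold_count[label] else 0.0
        result[label] = (precision, recall)
    return result


# Toy annotation (hypothetical): tutor labels vs. automatically assigned labels.
gold = ["Statement", "Statement", "Greeting", "Accept", "Reject", "Accept"]
predicted = ["Statement", "Accept", "Greeting", "Accept", "Reject", "Statement"]
scores = per_class_precision_recall(gold, predicted)
print(scores["Statement"])

# Per-class percentages as reported in this section, in table order.
reported_precision = [93, 94, 100, 92, 71, 90, 96, 97, 73, 35, 75, 100, 100, 100, 66, 100, 66]
reported_recall = [92, 93, 80, 80, 55, 51, 58, 78, 82, 27, 90, 71, 100, 69, 50, 36, 75]
macro_precision = sum(reported_precision) / len(reported_precision)
macro_recall = sum(reported_recall) / len(reported_recall)
print(round(macro_precision), round(macro_recall))
```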
The following table contains the precision and recall for each speech act class.
Speech act - label | Precision | Recall
Continuation | 93% | 92%
Statement | 94% | 93%
Greeting | 100% | 80%
Accept | 92% | 80%
Partial accept | 71% | 55%
Agreement | 90% | 51%
Understanding | 96% | 58%
Negative | 97% | 78%
Reject | 73% | 82%
Partial reject | 35% | 27%
Action directive | 75% | 90%
Info request | 100% | 71%
Thanks | 100% | 100%
Maybe | 100% | 69%
Conventional | 66% | 50%
Personal opinion | 100% | 36%
Sorry | 66% | 75%
OVT:
OVT1.1
Pilot site
UNIMAN
Pilot language
English
Operational Validation Topic
The tutors/experts find that the speech acts discovered in the conversation (chat or forum) are correct.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Tutors (2) manually annotated three sets of results from PolyCAFe. Relevance of the words reported on the topics
covered in the forum posts was assessed on a scale of 1 to 10, 10 being most relevant to the conversation.
Results:
37% (Frequency of concepts – of the words reported, 37% were appropriate)
33% (Topics detected that were relevant to the conversation)
17% (Relevant noun topics)
OVT:
OVT1.2
Pilot site
UNIMAN
Pilot language
English
Operational Validation Topic
The tutors/experts find that the labels corresponding to Garrison's community of inquiry model in a
forum are correct. FORUMS ONLY
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Tutors (2) manually annotated three sets of results from PolyCAFe. Accuracy of the feedback reported on the
Garrison Community of Inquiry model was assessed on a scale of 1 to 10, 10 being most relevant to the post assessed.
Results:
63%: individual CoI feedback was considered to be accurately categorised.
50%: information about collaboration was correct; some instances of a “BAD” result were reported where progress was good.
OVT:
OVT1.3
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The tutors/experts find that the scores assigned to the utterances are correct.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: The tutors (3) manually annotated two chat conversations from the first assignment with scores (from 1-4) for
each utterance. In order to be able to compute the inter-rater agreement, one chat conversation was annotated by two tutors.
Results:
Chat 1 (331 utterances):
Tutor 1 – Tutor 2 (inter-rater) correlation: 61%
Tutor 1 – PolyCAFe correlation: 60%
Tutor 2 – PolyCAFe correlation: 41%
Tutor average – PolyCAFe correlation: 57%
Chat 2 (277 utterances):
Tutor – PolyCAFe correlation: 55%
(No inter-rater data)
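The report does not state which correlation coefficient was used. Assuming a Pearson correlation over per-utterance scores, the comparisons above could be computed along the following lines; the function name and the example scores are hypothetical.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return covariance / sqrt(var_x * var_y)


# Hypothetical scores for six utterances (tutors use the 1-4 scale).
tutor_1 = [1, 2, 4, 3, 2, 4]
tutor_2 = [1, 3, 4, 3, 1, 4]
system = [0.2, 0.5, 0.9, 0.6, 0.4, 0.8]  # any scale works; correlation is scale-invariant
tutor_average = [(a + b) / 2 for a, b in zip(tutor_1, tutor_2)]

print(f"inter-rater: {pearson(tutor_1, tutor_2):.2f}")
print(f"tutor average vs system: {pearson(tutor_average, system):.2f}")
```

Averaging the two tutors before comparing with the system, as in the "Tutor average" row, reduces the influence of a single rater's scoring habits.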
OVT:
OVT1.4
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The tutors/experts find that the scores assigned to the participants for a given concept and globally are
correct.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Experiment B – PUB-NCIT
Results:
The average precision of the ranking provided by PolyCAFe is 67% when compared with the students' rankings (computed for all the 35 students, in 7 different conversations) and 77%
when compared with the tutors' rankings (computed on all the 7 conversations).
The average distance between the manual ranking and PolyCAFe's ranking is 0.43 when compared to the students' rankings and 0.23 when
compared to the tutors' rankings.
The following table highlights the precision, correlation and average distance between the rankings provided by the students, the tutors and
PolyCAFe.
Rankings compared | Correlation | Precision | Average distance
Tutors – System | 94% | 77% | 0.23
Students – System | 84% | 66% | 0.43
Tutors – Students | 84% | 71% | 0.40
OVT:
OVT1.5
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The tutors/experts find that PolyCAFe correctly identifies the important (relevant) concepts from the
conversation.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: The tutors (3) manually annotated three chat conversations from the first assignment with the most important 10
concepts and the next 10 important concepts (if any).
Results:
Comparing the concepts provided by the tutors for the three chats with the 30 most important topics determined by
PolyCAFe gave a precision of 18/28 = 64%.
No inter-rater agreement data, as not all the tutors have fulfilled this task.
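A minimal sketch of this kind of overlap calculation is given below. It assumes that the denominator is the number of distinct tutor-selected concepts and the numerator the number of those also present in the system's top topics; the report does not spell this out, and the function name and concept lists are hypothetical.

```python
from typing import Iterable, Tuple


def concept_overlap(
    tutor_concepts: Iterable[str], system_concepts: Iterable[str]
) -> Tuple[int, int, float]:
    """Return (matched, tutor_total, ratio) after simple case and whitespace normalisation."""
    tutor = {c.strip().lower() for c in tutor_concepts}
    system = {c.strip().lower() for c in system_concepts}
    if not tutor:
        raise ValueError("no tutor concepts supplied")
    matched = tutor & system
    return len(matched), len(tutor), len(matched) / len(tutor)


# Hypothetical concept lists for one chat.
tutor_list = ["Chat", "wiki", "blog", "forum"]
system_top = ["wiki", "chat", "wave", "blog", "email"]
matched, total, ratio = concept_overlap(tutor_list, system_top)
print(matched, total, ratio)

# The figure reported above, reproduced as plain arithmetic.
reported = 18 / 28
print(f"{reported:.0%}")
```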
OVT:
OVT2.1
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
Tutors/facilitators spend less time preparing feedback for learners compared with traditional means.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Time measurements for preparing the feedback by the tutors – each chat conversation was analyzed and
given feedback by 4 tutors – 2 using PolyCAFe and 2 not using the system. Tutor questionnaire.
Results:
Average time needed to prepare feedback without PolyCAFe: 84 minutes, standard deviation: 15 minutes
Average time needed to prepare feedback with PolyCAFe: 55 minutes, standard deviation: 20 minutes
Average time saved = (84 – 55) / 84 = 35%
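The time saving above is a relative difference against the baseline without PolyCAFe. The short sketch below reproduces the arithmetic; the function name is an assumption.

```python
def relative_time_saving(baseline_minutes: float, with_tool_minutes: float) -> float:
    """Fraction of the baseline time that is saved when the tool is used."""
    if baseline_minutes <= 0:
        raise ValueError("baseline must be positive")
    return (baseline_minutes - with_tool_minutes) / baseline_minutes


saving = relative_time_saving(84, 55)  # average minutes without / with PolyCAFe
print(f"{saving:.1%}")
```

The unrounded value is about 34.5%, which the report rounds to 35%. Given standard deviations of 15 and 20 minutes and only a few tutors per condition, the percentage is best read as indicative.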
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Tutors | 7. It takes less time to complete my teaching tasks using PolyCAFe than without the system. | Experimental | 4.5 | 0.84 | 83% | 6
Tutors | 8. Using PolyCAFe enables me to work more quickly than without the system. | Experimental | 4.5 | 0.84 | 83% | 6
Tutors | 9. I do not wait too long before receiving the requested information. | Experimental | 4.8 | 0.41 | 100% | 6
Tutors | 10. PolyCAFe provides me with the requested information when I require it (i.e. at the right time in my work activities). | Experimental | 4.7 | 0.52 | 100% | 6
Tutors | 34. I spend less time preparing feedback to learners than without the system. | Experimental | 4.5 | 0.84 | 83% | 6
Tutors | 35. I find that using PolyCAFe is a very time-efficient way of providing feedback. | Experimental | 4.3 | 0.52 | 100% | 6
Tutors | 36. I find the information needed to write the feedback for the learners more quickly using PolyCAFe than without it. | Experimental | 4.7 | 0.52 | 100% | 6
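The Mean, Standard deviation and % Agree / Strongly agree columns of the questionnaire tables can be derived from raw 5-point responses. The sketch below assumes a sample standard deviation (n - 1 denominator) and counts scores of 4 or 5 as agreement; the report states neither convention. The response list is hypothetical, chosen so that its summary matches the figures 4.5 / 0.84 / 83% / n = 6 reported for statement 7.

```python
from statistics import mean, stdev


def summarize_likert(responses, agree_threshold=4):
    """Return (mean, sample standard deviation, share of responses >= threshold)."""
    if len(responses) < 2:
        raise ValueError("need at least two responses for a sample standard deviation")
    agree_share = sum(1 for r in responses if r >= agree_threshold) / len(responses)
    return mean(responses), stdev(responses), agree_share


# Hypothetical responses on a 1-5 scale (5 = strongly agree); not the pilot data.
responses = [5, 5, 5, 5, 4, 3]
avg, sd, agree = summarize_likert(responses)
print(f"mean={avg:.1f} sd={sd:.2f} agree={agree:.0%} n={len(responses)}")
```

Other response sets could produce the same summary, so the list should not be read as the tutors' actual answers.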
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
One tutor said: “very useful tool in reducing time” and all the others agreed.
“It is easier to assess collaboration, involvement and determine the most important concepts and parts of the conversation.”
OVT:
OVT2.1
Pilot site
UNIMAN
Pilot language
English
Operational Validation Topic
Tutors/facilitators spend less time preparing feedback for learners compared with traditional means.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Tutors | 7. It takes less time to complete my teaching tasks using PolyCAFe than without the system. | Experimental | 2 | 0.71 | 0% | 5
Tutors | 8. Using PolyCAFe enables me to work more quickly than without the system. | Experimental | 2 | 0.71 | 0% | 5
Tutors | 9. I do not wait too long before receiving the requested information. | Experimental | 3 | 0.00 | 0% | 5
Tutors | 10. PolyCAFe provides me with the requested information when I require it (i.e. at the right time in my work activities). | Experimental | 2.8 | 0.45 | 0% | 5
Tutors | 34. I spend less time preparing feedback to learners than without the system. | Experimental | 2.4 | 1.14 | 20% | 5
Tutors | 35. I find that using PolyCAFe is a very time-efficient way of providing feedback. | Experimental | 2.6 | 1.14 | 20% | 5
Tutors | 36. I find the information needed to write the feedback for the learners more quickly using PolyCAFe than without it. | Experimental | 2.6 | 1.14 | 20% | 5
Formative results with respect to validation indicator
Stakeholder type
Results
Student Facilitators
Facilitators felt that PolyCAFe presented them with too much information and required a lot of interpretation to make sense of the results. They felt that the context of the discussion threads had been lost in the way the results were presented, due to the anonymization process.
“There's a lot of information but it's not clear which… what it means, how do I use it? What changes do I need to make to the guidance I'm giving to help people improve their contributions? I can't find that out without reading through all the information… I seriously think it will probably take longer than if I just look at the forums themselves”
OVT:
OVT2.2
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
It is easier (there is less cognitive load) for tutors/facilitators to provide feedback using PolyCAFe
compared with just reading the learners' online conversations.
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Tutors | 11a. Please rank on a 5-point scale the mental effort (1 = very low mental effort; 5 = very high mental effort) you invested to accomplish teaching tasks using PolyCAFe. | Experimental | 3.0 | 0.63 | 16% | 6
Tutors | 11b. Overall, using the system requires significantly less mental effort to complete my teaching tasks than when using normal chat transcripts. | Experimental | 4.8 | 0.41 | 100% | 6
Tutors | 37. I find it easier to analyze a chat conversation using PolyCAFe than without it. | Experimental | 4.8 | 0.41 | 100% | 6
Tutors | 38. There is a lot of information in the conversation that I cannot process without PolyCAFe. | Experimental | 4.0 | 0.63 | 83% | 6
Tutors | 39. I find that the discussion threads and their inter-animation are difficult to follow without PolyCAFe. | Experimental | 4.8 | 0.41 | 100% | 6
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“The cognitive load is much higher when not using PolyCAFe because the task is very difficult”
“It would be much more difficult not to use PolyCAFe”
OVT:
OVT2.2
Pilot site
UNIMAN
Pilot language
English
Operational Validation Topic
It is easier (there is less cognitive load) for tutors/facilitators to provide feedback using PolyCAFe
compared with just reading the learners' online conversations.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Student Facilitators | 11a. Please rank on a 5-point scale the mental effort (1 = very low mental effort; 5 = very high mental effort) you invested to accomplish teaching tasks using PolyCAFe. | Experimental | 3.8 | 0.84 | 60% | 5
Student Facilitators | 11b. Overall, using the system requires significantly less mental effort to complete my teaching tasks than when using normal chat transcripts. | Experimental | 2.8 | 1.30 | 40% | 5
Student Facilitators | 37. I find it easier to analyze a chat conversation using PolyCAFe than without it. | Experimental | 2.6 | 1.34 | 40% | 5
Student Facilitators | 38. There is a lot of information in the conversation that I cannot process without PolyCAFe. | Experimental | 2.4 | 0.89 | 0% | 5
Student Facilitators | 39. I find that the discussion threads and their inter-animation are difficult to follow without PolyCAFe. | Experimental | 2.2 | 0.84 | 0% | 5
Formative results with respect to validation indicator
Stakeholder type
Results
Student
facilitators
The facilitators found the results of the analysis took a lot of effort to interpret compared with following the threads. It was
difficult to see which feedback related to which thread, and to make sense of the figures returned.
“It's not clear how to address the feedback, how do I act on it?”
The numbers allocated had no reference attached to them on which to gauge what they meant. The groups comprised up to 8
students – the facilitators felt that the PolyCAFe service would be more appropriate in a larger group setting.
“the visualisation quickly gives a clear idea of who's participating but that's clear anyway because the group is quite small”
OVT:
OVT3.1
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
Tutors/facilitators perceive that the feedback received from the system helps them prepare feedback
for learners.
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Tutors | 6. The information the system provides me is accurate enough for helping me perform my teaching tasks. | Experimental | 3.8 | 0.41 | 83% | 6
Tutors | 40. PolyCAFe provides feedback that is relevant to my preparation of learner feedback. | Experimental | 4.2 | 0.75 | 83% | 6
Tutors | 41. PolyCAFe provides feedback that is useful to my preparation of learner feedback. | Experimental | 4.5 | 0.84 | 83% | 6
Tutors | 42. PolyCAFe's feedback is sufficiently accurate to inform my feedback. | Experimental | 3.8 | 0.41 | 83% | 6
Tutors | 43. I trust PolyCAFe to provide helpful feedback. | Experimental | 4.3 | 0.52 | 100% | 6
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“The feedback provided by PolyCAFe helps you to easily identify the important parts of a conversation and to get a quick
overview of the discussion”
“The visualization of the conversation is extremely useful”
“Not always the feedback is exact, but it does not influence the evaluation of the tutors”
OVT:
OVT3.1
Pilot site
UNIMAN
Pilot language
English
Operational Validation Topic
Tutors/facilitators perceive that the feedback received from the system helps them prepare feedback
for learners.
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Student Facilitators | 6. The information the system provides me is accurate enough for helping me perform my teaching tasks. | Experimental | 2.2 | 0.45 | 0% | 5
Student Facilitators | 40. PolyCAFe provides feedback that is relevant to my preparation of learner feedback. | Experimental | 2.6 | 1.14 | 20% | 5
Student Facilitators | 41. PolyCAFe provides feedback that is useful to my preparation of learner feedback. | Experimental | 2.6 | 1.14 | 20% | 5
Student Facilitators | 42. PolyCAFe's feedback is sufficiently accurate to inform my feedback. | Experimental | 2.4 | 1.14 | 20% | 5
Student Facilitators | 43. I trust PolyCAFe to provide helpful feedback. | Experimental | 2.6 | 1.14 | 20% | 5
Formative results with respect to validation indicator
Stakeholder type
Results
Student
Facilitators
One facilitator felt that the feedback was a helpful addition to their own observations of the discussion forums.
OVT:
OVT3.2
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
Learners perceive that the feedback received from the system contributes to informing their study
activities.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: System logging
Results:
285 visits to PolyCAFe and 1447 page views were recorded between November 1st and December 1st (more than 40 page views per student).
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Learners | 6. The information the system provides me is accurate enough for helping me perform my learning tasks. | Experimental | 3.7 | 0.52 | 60% | 25
Learners | 31. PolyCAFe provides feedback that is relevant to my study activities. | Experimental | 3.9 | 0.91 | 72% | 25
Learners | 32. PolyCAFe provides feedback that is useful to my study activities. | Experimental | 3.8 | 0.85 | 72% | 25
Learners | 33. PolyCAFe's feedback is sufficiently accurate to inform my study activities. | Experimental | 3.8 | 0.88 | 64% | 25
Learners | 34. I trust PolyCAFe to provide helpful feedback. | Experimental | 4.0 | 0.87 | 80% | 25
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“PolyCAFe is very useful to assess inter-connectivity and inter-animation, plus concept analysis.”
“Important tool for analyzing the conversations of my other colleagues in order to see whether I am position within the class.”
“The relevant concepts are very useful”
“useful for assessing the collaboration”
“very relevant for the progressive analysis of a group that has more than 2 discussion”
OVT:
OVT3.2
Pilot site
UNIMAN
Pilot language
English
Operational Validation Topic
Learners perceive that the feedback received from the system contributes to informing their study
activities.
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Learners | 6. The information the system provides me is accurate enough for helping me perform my learning tasks. | Experimental | 3.4 | 0.98 | 48% | 21
Learners | 31. PolyCAFe provides feedback that is relevant to my study activities. | Experimental | 3.6 | 0.98 | 62% | 21
Learners | 32. PolyCAFe provides feedback that is useful to my study activities. | Experimental | 3.6 | 1.03 | 71% | 21
Learners | 33. PolyCAFe's feedback is sufficiently accurate to inform my study activities. | Experimental | 3.4 | 1.03 | 57% | 21
Learners | 34. I trust PolyCAFe to provide helpful feedback. | Experimental | 3.5 | 1.08 | 62% | 21
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
Individual students found the feedback on their utterances interesting and useful in that it allowed them to see the ways in
which their utterances had been classified. Some students (n=5 agreed in a focus group) felt that some of the feedback was
irrelevant at a group level but useful to individuals to see their own performance in the group and it motivated them to develop
their responses.
“The assessment of my own statements was helpful... it let me see how I'd responded to others and I could work on my
answers and see the changes in the way the system classified my response... it was interesting”
OVT:
OVT3.3
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The feedback given by different tutors/facilitators to the same student is more consistent using
PolyCAFe than without using it (there is more homogeneity among the responses provided to
learners).
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Experiment B – PUB-NCIT. Two tutors manually analyzed all the feedback provided by the tutors for the first
assignment (with and without PolyCAFe) in order to find differences in consistency between feedback provided by different tutors to the same chat
conversation.
Results:
Each chat conversation received feedback from two tutors using PolyCAFe and two tutors not using PolyCAFe. All 14 pairs of feedback were evaluated for the consistency of the feedback on collaboration + inter-animation, content and overall. Marks ranged from 1 (very inconsistent) to 5 (very similar/consistent feedback).
Consistency in the assessment of collaboration and inter-animation increased slightly when using PolyCAFe (the average consistency grade for this aspect increased by 20%). However, the average consistency grades for content-related and overall feedback increased only marginally (by 11% and 13% respectively).
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“The feedback writing style is very difficult to change and it seems to have had a great influence in the feedback written by
each tutor regardless of using it (the system) or not…”
OVT:
OVT3.4
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The feedback provided by tutors/facilitators after using PolyCAFe is more extensive (higher quality)
than without using the system.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Experiment B – PUB-NCIT. Two tutors manually analyzed all the feedback provided by the tutors for the first
assignment (with and without PolyCAFe) in order to find differences in quality between feedback provided by different tutors to the same chat
conversation.
Results:
Each chat conversation received feedback from two tutors using PolyCAFe and two tutors not using PolyCAFe. The two tutors graded the quality of each piece of feedback with scores from 1 to 5 for collaboration + inter-animation quality, content quality and overall quality.
The overall quality increased by merely 10% (17% for collaboration and just 5% for content quality).
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“The quality of the feedback using PolyCAFe could have been better, but I did not want to copy the information already
delivered by the system”
OVT:
OVT4.1
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
Using PolyCAFe, tutors/facilitators monitor the learner's participation in online discussions better:
detect conversations with bad/good collaboration, discover the coverage of concepts of each
participant and other differences between learners.
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Tutors | 44. PolyCAFe is effective for monitoring the quality of collaboration. | Experimental | 4.0 | 0.63 | 83% | 6
Tutors | 45. PolyCAFe is effective for determining the extent of concept coverage of each participant. | Experimental | 4.3 | 0.52 | 100% | 6
Tutors | 46. PolyCAFe is effective for determining the relevant concepts in the conversation | Experimental | 4.7 | 0.52 | 100% | 6
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
The tutors all agreed that PolyCAFe is useful for monitoring the learner's participation in online discussions better than any
alternative. However, they debated the difficulty of finding a perfect measure for collaboration and suggested that several
measures should be investigated and compared with the current scores computed by PolyCAFe.
“PolyCAFe is very helpful to assess the degree of collaboration, which cannot be computed correctly without assistance”
OVT:
OVT4.1
Pilot site
UNIMAN
Pilot language
English
Operational Validation Topic
Using PolyCAFe, tutors/facilitators monitor the learner's participation in online discussions better:
detect conversations with bad/good collaboration, discover the coverage of concepts of each
participant and other differences between learners.
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Student facilitators | 43. PolyCAFe is effective for monitoring the quality of collaboration. | Experimental | 2.6 | 1.14 | 20% | 5
Student facilitators | 44. PolyCAFe is effective for determining the extent of concept coverage of each participant. | Experimental | 2.8 | 1.30 | 40% | 5
Student facilitators | 45. PolyCAFe is effective for determining the relevant concepts in the conversation | Experimental | 2.8 | 1.10 | 20% | 5
Formative results with respect to validation indicator
Stakeholder type
Results
Student
facilitators
Student facilitators felt that the way in which the results are presented makes the quality aspects of collaboration unclear.
Note: Individual feedback was removed for the pilot due to concerns about the ethical implication of presenting individuals with
feedback that was rated “BAD”.
“I didn't feel comfortable with the numbers, what do they mean? When it says “bad” it doesn't say in what way it's bad”.
Tutors
“there are many aspects of the discussions that were not picked up by PolyCafe”
“Group X showed empathy towards the children and the pain they may have gone through in the example used but I didn't
feel this was acknowledged in the feedback given. On the whole it seemed more like a frequency list of key concepts.”
OVT:
OVT4.2
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The system provides learners with information that helps them reflect better on their performance as
individuals and as group members compared with traditional means.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Student questionnaire (comparison to control group)
Results:
PolyCAFe is considered a big improvement in comparison to the current system (reading the chat log/transcript in any format), especially with
regard to allowing the students to reflect on their performance as an individual: the average score has increased from 2.7 to 4.2, while the
agreement rate has almost tripled from 30% to 80%. Moreover, PolyCAFe offers a better alternative for reflecting on each student's contribution in
the conversation as a member of a group, with an average score increased from 3.1 (control group) to 4.4 (experimental group), and the
agreement rate more than doubled from 40% to 88%.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Learners | 35. PolyCAFe helps me reflect on my performance as an individual. | Experimental | 4.2 | 0.88 | 80% | 25
Learners | 35. The current system (reading the chat transcript/log) helps me reflect on my performance as an individual. | Control | 2.7 | 1.12 | 30% | 10
Learners | 36. PolyCAFe helps me reflect on my contribution as a member of a group. | Experimental | 4.4 | 0.81 | 88% | 25
Learners | 36. The current system (reading the chat transcript/log) helps me reflect on my contribution as a member of a group. | Control | 3.1 | 1.17 | 40% | 10
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
PolyCAFe was considered very useful for helping the students reflect on their performance (either as individuals or as part of a group). The learners considered that it would have been even more useful if the topics of the conversation had been somewhat more unfamiliar to them or inter-disciplinary.
“PolyCAFe is really helpful to compare you with people from other groups, outside my chat group” (by looking at their feedback
provided by PolyCAFe and comparing it with yours)
OVT:
OVT4.2
Pilot site
UNIMAN
Pilot language
English
Operational Validation Topic
The system provides learners with information that helps them reflect better on their performance as
individuals and as group members compared with traditional means.
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Learners | 35. PolyCAFe helps me reflect on my performance as an individual. | Experimental | 3.5 | 1.08 | 57% | 21
Learners | 35. The current system (reading the chat transcript/log) helps me reflect on my performance as an individual. | Control | 3.6 | 0.79 | 30% | 7
Learners | 36. PolyCAFe helps me reflect on my contribution as a member of a group. | Experimental | 3.6 | 1.08 | 62% | 21
Learners | 36. The current system (reading the chat transcript/log) helps me reflect on my contribution as a member of a group. | Control | 3.1 | 1.07 | 57% | 7
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“I like the list of words, it prompted me to consider the topics that we had covered” (n=6 agree in focus group 1)
OVT:
OVT4.3
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The visualization offers the users a better understanding of chat conversations and discussion forums.
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Tutors | 47. PolyCAFe's visualization is effective for monitoring the quality of collaboration. | Experimental | 4.5 | 0.55 | 100% | 6
Tutors | 48. PolyCAFe's visualization is useful for determining the inter-animation of concepts and ideas (social learning). | Experimental | 4.0 | 0.63 | 83% | 6
Learners | 37. PolyCAFe's visualization is effective for monitoring the quality of collaboration. | Experimental | 4.4 | 0.64 | 92% | 25
Learners | 37. The current visualization (reading the chat transcript/log in html format) is effective for monitoring the quality of collaboration. | Control | 2.7 | 1.41 | 30% | 10
Learners | 38. PolyCAFe's visualization is useful for determining the inter-animation of concepts and ideas (social learning). | Experimental | 4.2 | 0.72 | 84% | 25
Learners | 38. The current visualization (reading the chat transcript/log in html format) is useful for determining the inter-animation of concepts and ideas (social learning). | Control | 2.7 | 1.41 | 30% | 10
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“The visualization of the conversation is extremely useful”
“can get a easy overview of the collaboration, implication, number of links and discussion threads”
Learners
“the inter-animation of concepts is difficult to determine without the visualization”
“good representation of the conversation”... “permits you to find out where you had a good collaboration, how many colleagues
linked to you”
OVT:
OVT4.3
Pilot site
UNIMAN
Pilot language
English
Operational Validation Topic
The visualization offers the users a better understanding of chat conversations and discussion forums.
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Student Facilitators | 47. PolyCAFe's visualization is effective for monitoring the quality of collaboration. | Experimental | 2.8 | 1.10 | 20% | 5
Student Facilitators | 48. PolyCAFe's visualization is useful for determining the inter-animation of concepts and ideas (social learning). | Experimental | 3.0 | 0.71 | 20% | 5
Learners | 37. PolyCAFe's visualization is effective for monitoring the quality of collaboration. | Experimental | 3.8 | 1.18 | 71% | 21
Learners | 37. The current visualization (reading the chat transcript/log in html format) is effective for monitoring the quality of collaboration. | Control | 3.3 | 0.95 | 57% | 7
Learners | 38. PolyCAFe's visualization is useful for determining the inter-animation of concepts and ideas (social learning). | Experimental | 3.9 | 1.2 | 67% | 21
Learners | 38. The current visualization (reading the chat transcript/log in html format) is useful for determining the inter-animation of concepts and ideas (social learning). | Control | 3.4 | 1.13 | 57% | 7
Formative results with respect to validation indicator
Stakeholder type
Results
Student
Facilitators
Could not see quality indicators in the visualisation – suggested that it could only be used to quantify extent of participation.
Learners
Students commented that they could see the interactions and described these as “helpful” but the visualisation did not expose the concepts and ideas being discussed. They felt that this would work better if shown with an individual discussion thread, rather than as an overall analysis of the group's whole forum.
OVT:
OVT5.1
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
Learner performance in online discussions is improved in the areas of content coverage and
collaboration when using PolyCAFe.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Measurements
Results:
Based on scores computed automatically by PolyCAFe:
Indicator | Experimental group for the second assignment | Control group for the second assignment | Improvement over control group
Average score for a chat conversation (collaboration + content) | 6.80 | 6.37 | (6.80-6.37)/6.37 = 6.8%
Average importance of the most important 20 concepts | 0.194 | 0.192 | 1.2%
Average number of utterances | 351 | 338 | (351-338)/338 = 3.8%
Average distribution of (implicit and explicit) links between utterances | 1.12 | 0.87 | (1.12-0.87)/0.87 = 29%
For all the indicators presented above, the chat conversations of the experimental groups scored better than those of the control group. However, a substantial increase between the two groups was found only for collaboration (average number of links per utterance) and for the total average score of an utterance.
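The improvement figures reported for the four indicators follow the pattern (experimental - control) / control. The sketch below recomputes them from the values as printed; the function name and the short indicator labels are illustrative assumptions.

```python
def relative_improvement(experimental, control):
    """Improvement of the experimental value over the control value, as a fraction."""
    if control == 0:
        raise ValueError("control value must be non-zero")
    return (experimental - control) / control


# (experimental, control) pairs as printed in the results; labels are shorthand.
indicators = {
    "average chat score": (6.80, 6.37),
    "average concept importance": (0.194, 0.192),
    "average number of utterances": (351, 338),
    "average links per utterance": (1.12, 0.87),
}
improvements = {
    name: relative_improvement(exp, ctrl) for name, (exp, ctrl) in indicators.items()
}
for name, value in improvements.items():
    print(f"{name}: {value:.1%}")
```

Three of the four results agree with the reported percentages after rounding. For concept importance, the printed values 0.194 and 0.192 give about 1.0% rather than the reported 1.2%, which may be because the report used unrounded averages.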
Stakeholder type
Results
Learners
The learners felt that the feedback for the first set of assignments gave them indicators showing that they needed to collaborate better and be more involved in the discourse. Moreover, the sets of concepts that were not covered also offered them some insight. They also said that the task was a bit simple and that they would have found the feedback even more useful when discussing topics that they knew less about.
OVT:
OVT6.1
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The direct feedback provided by the system encourages learners to undertake further study to address
gaps in their coverage.
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Learners | 18. Using PolyCAFe increases my curiosity about the learning topic. | Experimental | 3.7 | 0.98 | 56% | 25
Learners | 20. Using the system motivates me to explore the learning topic more fully. | Experimental | 3.5 | 1.05 | 40% | 25
Learners | 22. I am eager to explore different things with PolyCAFe. | Experimental | 3.6 | 1.12 | 48% | 25
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
The learners said that the topic of the chat conversations was not very difficult for them and this was one of the reasons why
the feedback was not as useful as it would have been for a more difficult topic. They also said that the feedback would be
more interesting for open-questions and discussions related to humanities, economics or law, for example.
OVT:
OVT6.1
Pilot site
UNIMAN
Pilot language
English
Operational Validation Topic
The direct feedback provided by the system encourages learners to undertake further study to address
gaps in their coverage.
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Learners | 18. Using PolyCAFe increases my curiosity about the learning topic. | Experimental | 3.3 | 0.75 | 43% | 21
Learners | 20. Using the system motivates me to explore the learning topic more fully. | Experimental | 3.0 | 0.92 | 33% | 21
Learners | 22. I am eager to explore different things with PolyCAFe. | Experimental | 3.3 | 1.23 | 43% | 21
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
The types of feedback provided were not perceived as providing constructive direction for further study.
“Conversation feedback and utterance feedback needs to be clearer to give me directions on what I need to do”
Some students (n=5 agreed in a focus group) felt that some of the feedback was irrelevant at a group level but useful to
individuals to see their own performance in the group and it motivated them to develop their responses.
“The assessment of my own statements was helpful... it let me see how I'd responded to others and I could work on my
answers and see the changes in the way the system classified my response... it was interesting”
OVT:
OVT7.1
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
There is a saving in institutional resources overall*
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Measurements, Focus group with tutors
Results:
Average time needed to prepare feedback without PolyCAFe: 84 minutes, standard deviation: 15 minutes
Average time needed to prepare feedback with PolyCAFe: 55 minutes, standard deviation: 20 minutes
Average time saved = (84 – 55) / 84 = 35%
For a class with 100 students and 2 chat assignments, the total time gained by tutors who use PolyCAFe when providing manual feedback for the 50 chat groups would be 29 * 50 = 1450 minutes ≈ 24 hours, which could be used more efficiently.
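The class-level projection multiplies the average per-chat saving by the number of chats. A minimal sketch of that estimate follows; the function name is an illustrative assumption, and the chat count of 50 is taken directly from the text rather than derived from a group size.

```python
def projected_time_saved(minutes_without, minutes_with, chat_count):
    """Total minutes saved if every chat gets the average per-chat saving."""
    return (minutes_without - minutes_with) * chat_count


# Averages reported above: 84 minutes without PolyCAFe, 55 minutes with it, 50 chats.
total_minutes = projected_time_saved(84, 55, 50)
hours, minutes = divmod(total_minutes, 60)
print(f"{total_minutes} minutes = {hours} h {minutes} min")
```

The exact result is 24 hours and 10 minutes, which the report gives as 24 hours. The estimate treats the average saving as constant across chats and ignores the reported standard deviations.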
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
The tutors said that PolyCAFe might be very useful if used as an assessment standard for various assignments and to
determine whether (novice) tutors are giving correct feedback.
OVT:
OVT8.1
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The service addresses one or more institutional objectives
Formative results with respect to validation indicator
Stakeholder type
Results
Teaching
manager
Economy of the university:
A key aspect is the saving for the University from many points of view (the time and availability of tutors, for
example), but first and foremost the service should improve the educational process and the knowledge that the students
acquire. The head of the computer science department also said that PolyCAFe would be very useful for other departments in
the University.
“I am also convinced that this can be an economy for the University from many points of view, but mainly it should be first an
improvement in the educational process and in the knowledge that the students acquire…”
Student intellectual and social development:
It is important for students who work in teams, especially as more and more projects require team-work and
collaboration.
Chat is an application widely used by students, so it is also important to be able to support its use for the “professional student”.
“The tool helps for clarifying different aspects until the students become acquainted with a certain subject”
Supporting tutors in assessment:
“It could be useful even for the teaching staff in order to coordinate activities between the same class, but which has different
professors and tutors”.
“I think that every course could benefit from using the system, but the question is to which extent”, “as the tutors can easily
have a view of the level of the class, of the degree of collaboration”
“I think that the product has a great potential and it would be very useful for other departments as well… for example, the
department of economics, pedagogy, …”
OVT: OVT8.1
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: The service addresses one or more institutional objectives
Formative results with respect to validation indicator
Stakeholder type
Results
Teaching manager
Provision of formative feedback: “We need to provide high quality feedback to students, formative feedback is an area that
we have been underperforming in so systems such as the one you’ve shown could go some way towards addressing the need
for us to improve.”
OVT: OVT9.1
Pilot site: PUB-NCIT
Pilot language: English
Operational Validation Topic: Users were motivated to continue to use the system after the end of the formal validation activities
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Measurements
Results:
Number of visits between December 26th and February 6th: 334 (approximately 100 visits could be from the long thread validation experiment)
Number of pageviews between December 26th and February 6th: 1164
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n=
Tutors | 21. I would recommend this system to other teachers to help them in their teaching. | Experimental | 4.8 | 0.41 | 100% | 6
Tutors | 22. I am eager to explore different things with PolyCAFe. | Experimental | 4.3 | 0.53 | 100% | 6
Tutors | 29. I would like to use the service in my teaching after the pilot. | Experimental | 4.8 | 0.41 | 100% | 6
Tutors | 30. If the service is available after the pilot, I will definitely use it in my teaching. | Experimental | 4.8 | 0.41 | 100% | 6
Learners | 21. I would recommend this system to other learners to help them in their teaching. | Experimental | 3.8 | 1.08 | 60% | 25
Learners | 22. I am eager to explore different things with PolyCAFe. | Experimental | 3.6 | 1.12 | 48% | 25
Learners | 29. I would like to use the service in my learning activities after the pilot. | Experimental | 3.4 | 1.04 | 48% | 25
Learners | 30. If the service is available after the pilot, I will definitely use it in learning activities. | Experimental | 3.3 | 1.06 | 44% | 25
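As an illustration of how the Mean, Standard deviation and % Agree / Strongly agree columns relate to individual answers, the sketch below summarises one set of Likert responses. It assumes a 5-point scale on which 4 means "Agree" and 5 means "Strongly agree", and it assumes the sample standard deviation (n - 1 denominator); the report states neither. The response list is hypothetical: it was chosen only because it is consistent with the figures reported for six tutors (4.8, 0.41, 100%), and it is not the raw pilot data.

```python
from statistics import mean, stdev


def likert_summary(responses, agree_threshold=4):
    """Summarise Likert responses as in the questionnaire tables.

    Assumptions (not stated in the report): responses are integers 1-5,
    answers >= agree_threshold count as Agree / Strongly agree, and the
    standard deviation is the sample standard deviation.
    """
    n = len(responses)
    agree = sum(1 for r in responses if r >= agree_threshold)
    return {
        "n": n,
        "mean": round(mean(responses), 1),
        "sd": round(stdev(responses), 2),
        "pct_agree": round(100 * agree / n),
    }


# Hypothetical answers from six tutors, for illustration only.
hypothetical_tutor_responses = [5, 5, 5, 5, 5, 4]
summary = likert_summary(hypothetical_tutor_responses)
print(summary)
```

With such small groups (n = 5 or 6 tutors) a single response shifts the percentage by 17 to 20 points, which is worth keeping in mind when comparing the tutor rows across pilot sites.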
OVT: OVT9.1
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: Users were motivated to continue to use the system after the end of the formal validation activities
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n=
Tutors | 21. I would recommend this system to other teachers to help them in their teaching. | Experimental | 3 | 1.58 | 40% | 5
Tutors | 22. I am eager to explore different things with PolyCAFe. | Experimental | 3 | 1.22 | 40% | 5
Tutors | 29. I would like to use the service in my teaching after the pilot. | Experimental | 3.2 | 1.48 | 40% | 5
Tutors | 30. If the service is available after the pilot, I will definitely use it in my teaching. | Experimental | 3.2 | 1.48 | 40% | 5
Learners | 21. I would recommend this system to other learners to help them in their teaching. | Experimental | 3.5 | 1.08 | 57% | 21
Learners | 22. I am eager to explore different things with PolyCAFe. | Experimental | 3.3 | 1.23 | 43% | 21
Learners | 29. I would like to use the service after the pilot. | Experimental | 3.3 | 1.19 | 48% | 21
Learners | 30. If the service is available after the pilot, I will definitely use it | Experimental | 3.2 | 1.08 | 38% | 21
OVT: OVT9.2
Pilot site: PUB-NCIT
Pilot language: English
Operational Validation Topic: A high score was obtained in the generic questionnaires (based on UTAUT: likelihood of adoption by users)
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Generic questionnaire - learners
Results:
Descriptive Statistics - Learners
Scale | N | Mean | Std. Deviation
Effectiveness | 31 | 3.53 | 0.630
Efficiency | 31 | 3.60 | 0.735
Cognitive Load | 31 | 4.29 | 0.902
Usability | 31 | 4.05 | 0.786
Satisfaction | 31 | 3.76 | 0.753
Facilitating conditions | 31 | 3.78 | 0.723
Self-Efficacy | 31 | 3.86 | 0.783
Behavioural intention | 31 | 3.50 | 1.000
PolyCAFe PUB-NCIT | 31 | 3.75 | 0.576
Valid N (listwise) | 31
OVT: OVT9.2
Pilot site: UNIMAN
Pilot language: English
Operational Validation Topic: A high score was obtained in the generic questionnaires (based on UTAUT: likelihood of adoption by users)
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Generic questionnaire - learners
Results:
Descriptive Statistics - Learners
Scale | N | Mean | Std. Deviation
Effectiveness | 28 | 3.32 | 0.700
Efficiency | 28 | 3.38 | 0.658
Cognitive load | 28 | 3.00 | 1.186
Usability | 28 | 3.71 | 0.717
Satisfaction | 28 | 3.37 | 0.782
Facilitating conditions | 28 | 3.73 | 0.685
Self-Efficacy | 28 | 3.61 | 0.732
Behavioural intention | 28 | 3.29 | 1.040
PolyCAFe UNIMAN | 28 | 3.46 | 0.610
Valid N (listwise) | 28
OVT: OVT9.3
Pilot site: PUB-NCIT
Pilot language: English
Operational Validation Topic: Tutors attending a dissemination workshop give high scores to the question 'how likely are you to adopt the service?'
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n=
Tutors | How likely are you to adopt PolyCAFe in your own educational practice? | Dissemination | 4.05 | 0.62 | 84% | 19
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“It can be an useful tool for collaborative learning, especially to stimulate the interest and the aptitudes of the students in order
to participate in an authentic debate on a given subject.”
“the applicability is practically unlimited when organizing free chats”
“considering to apply PolyCAFe on forums and taking decisions based on the options/discussions of the students”
OVT: OVT10.1
Pilot site: PUB-NCIT
Pilot language: English
Operational Validation Topic: Learners are more involved and motivated with the course by using PolyCAFe.
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
The learners said that using PolyCAFe offered them useful information for the second assignment, especially related to
increasing collaboration and concepts that were not covered.
The learners considered that it is very important to see the results of their colleagues in order to be able to compare
themselves with other peers. Moreover, they said that using PolyCAFe for more than two assignments would be more
motivating and useful for them.
Section 4: Results – validation activities informing future changes / enhancements to the system
VALIDATION ACTIVITY
Pilot partner: PUB-NCIT
Service language: English
Additional formative results (not associated with validation topics)
Alpha testing
Correction of importance of topics, concepts and overall importance to be shown with the same number of decimals
(two decimals)
Beta testing
Slight modification of the grading algorithm by correcting some bugs
Tutor interviews
Improve the help section.
Improve the navigation from one widget to another (maybe have a non-widget version)
Make the transition smoother from local feedback (participant or utterance) to global feedback (conversation
feedback or visualization)
Implement a detailed scoring for the utterances
Improve the conversation threads detection algorithm
Utterance feedback: highlight important concepts in an utterance, colour utterances according to their importance
Implement support for Math formulas and, maybe, for simple scripted tasks
Tutor workshop(s)
Implement automatic LSA training from online resources for various topics and domains, in order to configure the
system more easily
Develop interfaces to popular discussion forum interfaces, like phpBB, Moodle, etc.
Extend to scripted activities or activities that require the students to reach certain learning outcomes: develop a way
to measure the coverage of each learning outcome.
Learner focus group 1
Improve the interface by moving all the widgets into a single window, use of tabs, use of icons
Improve the help, provide tool-tips, visual hints, “what is this?” buttons
Easier to upload any kind of chat log
Better privacy and user management
Allowing the possibility to “track” a participant with different usernames in different chat conversations
Provide real-time feedback during a chat conversation
Extend the analysis on a team-level basis in order to compare different teams/chats with each other
The current feedback should be rephrased, maybe in a more direct manner, such as giving advice to the
participants instead of a score
Learner focus group 2 (prioritisation of enhancements)
Learners judged that the five most important areas for enhancement of the system (i.e. clusters) are:
1. Improve the interface, especially by not using the widgets.
2. Improve the help and examples section.
3. Provide real-time feedback for chat conversations.
4. Extend the analysis to provide additional feedback that determines the position of a team within the group of all the
teams that solved an assignment (also rank teams, not only participants inside a team).
5. The feedback should be expressed in a more direct manner (like advice), instead of just providing scores and
indicators.
Learners judged that the five most important single improvements that should be made to the system are:
1. Implement an interface that has a single widget or a non-widget version.
2. Extend the help section to contain examples for each functionality.
3. Reformulate the feedback to sound more like advice.
4. Also rank teams, not only participants inside a team.
5. Integrate PolyCAFe’s feedback into a real chat environment.
Teaching manager
interview
I think that every course could benefit from using the system, but the question is to which extent
Other (please specify)
Development team: Extend PolyCAFe to the Romanian language.
Determine in what pedagogical scenarios PolyCAFe is most useful
VALIDATION ACTIVITY
Pilot partner: UNIMAN
Service language: English
Additional formative results (not associated with validation topics)
Alpha testing
Removal of individual feedback service – Good / Bad results do not align with institutional policy
Beta testing
Refinement of topics used in LSA space on which grading was based – an activity was undertaken by the team at
UNIMAN to produce a set of MESH headings to underpin PolyCAFe’s use with the professional development forums.
These were used to generate a new semantic space for the third validation round.
Facilitator interviews
Results must be accompanied by tooltip help – providing a user handbook is too unwieldy
The quantity of data shown needs to be reduced and reporting needs to be more concise
The data needs to be shown at an individual thread level of the discussion forum to be contextually grounded for the
viewer
Tutors felt the feedback wasn’t sufficiently objective and of a high enough standard to make further use of this on a
stand-alone basis, but felt that if the accuracy of the reporting and the summary data were to be improved it could be
used for reporting, to provide summary data on a dashboard for all groups, rather than as a direct feedback
mechanism.
Tutors felt, however, that with improved usability and reliable summary reporting, the service could provide a useful
aid to their practice.
Tutor workshop(s)
Tutors saw value in the system as an automated means to provide real-time feedback to students
Tutors liked the chat visualisation as it gave them an immediate sense of whether learners were engaging
Tutors disliked word ratings, presented without a clear rationale.
Learner focus group
Improve the usability – the widgets are not intuitive
Improve the help, provide tool-tips, visual hints, “what is this?” buttons
The numbers are not meaningful and the text feedback is not constructive
Provide constructive suggestions for ways in which the recipient of the feedback might improve
Learner focus group 2 (prioritisation of enhancements)
Learners judged that the five most important areas for enhancement of the system (i.e. clusters) are:
1. Usability – navigation is not intuitive
2. Performance indicators – numerical data represents a high cognitive load for students to interpret
3. Unclear how to address feedback – Suggestions for this must be displayed where scores are low
Learners judged that the five most important single improvements that should be made to the system are:
1. Provide “less of more” – better summary data
2. Improve the way in which text feedback is displayed, use colour to identify good and bad, instead of words/numbers
3. Feedback on individual utterances would be more helpful if the user could navigate the utterances in forum format, list
view is difficult to follow.
Teaching manager interview
Provide the service as a dashboard to allow tutors to view and compare performance across multiple groups.
Improve the reliability of the service and adjust the workflow of how feedback is provided to address institutional
requirements for feedback.
Section 5: Results – validation activities informing transferability, exploitation and barriers to adoption
VALIDATION ACTIVITY
Partner(s) involved: PUB-NCIT
Service language: English
Additional formative results (not associated with validation topics)
Tutor interviews
PolyCAFe would not be useful for classes that require special skills or involve solving practical problems, nor for
artistic classes.
It is also not suited to Maths or other classes with formulas, nor to scripted activities.
Suggestion: use PolyCAFe for discussions on European projects between partners to highlight differences in ideas,
concept coverage, collaboration, etc.
PolyCAFe might be very useful if used as an assessment standard for various assignments and to determine which
(novice) tutors are not giving correct feedback.
Tutor workshop(s)
Problems with getting the corpora required for the latent semantic spaces for a lot of subjects: there should be a
mechanism to use available online text or books
Very useful for free chatting
The tutors highlighted the difficulty of finding relevant discussion subjects for some domains/courses that are
suitable for being analyzed by PolyCAFe
The tutors wanted feedback with regard to completing or reaching learning objectives during a discussion
Learner focus group 1
Concerns with data privacy.
Some students do not feel the need for feedback after being engaged in a chat conversation.
Usability problems for those users that are not very experienced with software tools.
It would be more useful for open-topic discussions.
Useful for domains like law, marketing, social sciences, etc.
Very useful to have PolyCAFe’s feedback for subjects where the learners are novice (or not very advanced).
Teaching manager interview
“I think that the product has a great potential and it would be very useful for other departments as well… for example,
the department of economics, pedagogy, …”
“There might be a problem with the acceptance of other teaching managers of this combined form of face to face and
online (hybrid) learning”
“I am also convinced that this can be an economy for the University from many points of view, but mainly it should be
first an improvement in the educational process and in the knowledge that the students acquire…”
Other (please specify)
Major issues encountered in transferring PolyCAFe to Romanian:
o No open-source POS tagging software is available. Solution: build one from freely available annotated corpora.
o No Romanian WordNet is available for open usage. Solution: use dictionary entries instead.
o The stemmers for Romanian are not as good as for English. Solution: use a lemmatizer.
VALIDATION ACTIVITY
Partner(s) involved: UNIMAN
Service language: English
Additional formative results (not associated with validation topics)
Alpha testing
Major issues encountered in transferring PolyCAFe to the Medicine domain:
o The detailed issues discussed in the forum are not always detected by the system, though they may be
relevant to the high-level topics prescribed for discussion within the activity.
o “Good / Bad” results for individual performance do not meet institutional feedback requirements.
Tutor (facilitator)
interviews
Facilitators will only adopt the service if it provides constructive guidance. They need aggregated data about
participation levels, suggestions for topics and clearer reporting of key issues to help plan interventions.
The service would need to be improved in its accuracy and usability to provide concise, objective feedback.
The service outputs take too much time to interpret and are not meaningful in their present format.
Tutor workshop(s)
Tutors see the value of a service to summarise discussion forum activity but have different information requirements
to that currently output by the service. They would like to see information about how well different groups are
performing, participation levels and critical issues, in a tutor dashboard.
More summary data reported in a clearer interface across multiple groups would help tutors to identify intervention
points more easily. There is a perception that there is “too much” information delivered by the PolyCAFe service in its
current format.
The topics on which PolyCAFe provided feedback to students were not the same as those identified by tutors who
viewed the student outputs and tutors felt that unless there was a workflow to manage the release of feedback, they
would not trust the system.
Learner focus group 1
Students suggested that tab-based navigation might make the service simpler to use than having to select individual
service components from a list of widgets. They were keen that the service be integrated into the institutional VLE,
with reporting on individual forums at the thread level
Students were frustrated by the feedback because it didn’t suggest how low scores could be improved. The use of
numerical scores was disliked as there was no legend as to how the numbers were derived. Trust in the numbers
was low. Words like “Bad” on feedback were not considered useful and students felt that if they received a “Bad”
grading it would not motivate them to contribute further.
Colour coding was suggested as one means of addressing and consolidating both numeric and text based feedback,
using some kind of traffic-light based system to identify areas in need of further development. This was considered by
all present to represent a good approach which would motivate people to engage to achieve the “Green light” status.
Learner focus group 2
Students were keen to use the system with the institutional VLE. They suggested that if processing was nearer to
real-time the results would motivate them to develop their responses more thoroughly.
The scoring mechanisms need to be simple, the cognitive load of using different scales of numeric and text data was
considered higher than reading through the forums. By changing the presentation format of the results to a simplified
single scale, the students thought it would probably help them to judge the results more easily and would have
greater potential to save time in identifying areas for further development.
Teaching manager
interview
System does not meet institutional policy requirements about feedback. Further work is needed to improve the format
in which the students are presented with the feedback from PolyCAFe.
“Blackboard is the institutional VLE at the University of Manchester. There are clear guidelines about student
feedback and this needs to address the requirements outlined in these guidelines. Statements such as “good” and
“bad” are not sufficiently justified in the software to provide a basis on which we could currently consider
implementing this on an institutional basis”
Other: workshop with
the e-learning team
A workshop was held with members of the Faculty e-learning team (n=22). Results of the Likert question "How likely
are you to adopt PolyCAFe in your own educational practice?" were: mean 3.7, SD 0.95, 68% Agree/Strongly Agree
Other
PolyCAFe works on keywords, but concepts in professionalism often contain more than one word (Swiss Cheese
Model, personal hygiene)
The keywords we selected as relevant for the learning task didn't undergo any tuning. Related to this is the important
issue that we weren't 100% convinced that we were mapping our course to keywords well enough. It was really
difficult to know what to do in this case.
Transferability questionnaire: Relevance of the service in other pedagogic settings
Pedagogic setting
Reason(s)
Pedagogic settings for which the service would be
suitable:
Setting 1: Use of PolyCAFe together with chat or
forums for revising for exams.
Reasons: Students get feedback for their understanding of the topic under
discussion and suggestions for improvement.
Setting 2: Use of PolyCAFe together with chat or
forums for finding collaborative solutions to problems
that can be described without the importance of a
sequence of steps (PBL).
Reasons: Students get feedback for their understanding of the problem and of
the elements proposed as a solution.
Setting 3: Use of PolyCAFe together with chat or
forums to further investigate a given topic of interest to
the learner (Self Regulated Learning).
Reasons: Students can assess their understanding of the topic relative to their
peers in the chat, learn from their peers, see which peers are more
knowledgeable.
Pedagogic settings for which the service would be less
suitable:
Setting 1: Use of PolyCAFe together with chat or
forums in a setting that involves scripted collaboration.
Reasons: The scripts are not accounted for by PolyCAFe in the analysis.
Setting 2: Use of PolyCAFe together with chat or
forums in a setting that does not involve or require
collaboration (that is designed to be solved individually).
Reasons: PolyCAFe is designed to be used in collaboration focused settings.
Setting 3: Use of PolyCAFe together with chat or
forums in a setting where students do not collaborate
efficiently and they focus on providing long answers to
questions, without debating.
Reasons: PolyCAFe is designed to be used for collaborative discussions where
the posts/utterances are relatively short and the participants engage in active
discussions that also involve short messages and arguments.
Transferability questionnaire: institutional policies and practices
Interoperability with the institutional LMS is needed (PUB-NCIT: Moodle; UNIMAN: Blackboard).
Privacy concerns about the use of chat conversation and discussion forums of the students by the services could be overcome by anonymizing,
not showing information about students to peers without their consent, asking for students' consent before using the services.
Transferability questionnaire: Relevance of the service in other domains
Types of domain
Reason(s)
Types of domain for which the service would be
suitable:
Setting 1: All domains where textual descriptions of
descriptive knowledge are suitable and sufficient
(few or no images, formulas, specific numerical data or
procedural knowledge are needed): several areas of computer science,
literature, psychology, education, social and human
sciences.
Reasons: Linguistic technologies (NLP pipe, LSA, ontologies) do not account
for pictorial descriptions. Moreover, it is difficult to address procedural
knowledge, since the order of the steps in a procedure is crucial and difficult to
analyse automatically (as stated before, the sequence of steps is unknown, similar to scripts).
Types of domain for which the service would be less
suitable:
Setting 1: See above: several areas of geography,
medicine, mathematics, physics, engineering, etc.
Reasons: See above
Setting 2: All domains for which it is difficult to have a
large corpus of relevant text material.
Reasons: Needed for LSA training.
Section 6: Conclusions
Validation Topics
OVT
Operational Validation Topic
Validated
unconditionally
Validated with
qualifications*
Not
validated
Qualifications to validation
UNIMAN
(English)
PUB (English): Some speech
acts have low precision and recall
which should be improved (or at
least further investigated).
UNIMAN (English): Speech acts
not sufficiently correct.
PVT1: Verification of accuracy of NLP tools
OVT1.1
The tutors/experts find that the speech acts
discovered in the conversation (chat or forum)
are correct.
PUB
(English)
OVT1.2
The tutors/experts find that the labels
corresponding to Garrison’s community of
inquiry model in a forum are correct.
UNIMAN
(English)
UNIMAN (English): Accuracy
should be better to validate
unconditionally.
PUB (English): No forums at
PUB.
OVT1.3
The tutors/experts find that the scores assigned
to the utterances are correct.
PUB
(English)
OVT1.4
The tutors/experts find that the scores assigned
to the participants for a given concept and
globally are correct.
PUB
(English)
OVT1.5
The tutors/experts find that the PolyCAFe
correctly identifies the important (relevant)
concepts from the conversation.
PUB
(English)
PUB (English): Precision should
be a little better.
PVT2: Tutor efficiency
OVT2.1
Tutors/facilitators spend less time preparing
feedback for learners compared with traditional
means.
PUB
(English)
UNIMAN
(English)
OVT2.2
It is easier (there is less cognitive load) for
tutors/facilitators to provide feedback using
PolyCAFe compared with just reading the
learners’ online conversations.
PUB
(English)
UNIMAN
(English)
UNIMAN
(English)
UNIMAN (English): Visualisation
component is a valuable aid but
text / numerical feedback needs
further consideration in how it is
presented to make it meaningful
PVT3: Quality and consistency of (semi-)
automatic feedback OR information returned
by the system
OVT3.1
Tutors/facilitators perceive that the feedback
received from the system helps them prepare
feedback for learners.
PUB
(English)
OVT3.2
Learners perceive that the feedback received
from the system contributes to informing their
study activities.
PUB
(English)
OVT3.3
The feedback given by different tutors/facilitators
to the same student is more consistent using
PolyCAFe than without using it (there is more
homogeneity among the responses provided to
learners).
PUB
(English)
PUB (English): It seems that the
feedback style depends on the
tutor very much and the
consistency of the feedback was
only slightly changed.
OVT3.4
The feedback provided by tutors/facilitators after
using PolyCAFe is more extensive (higher
quality) than without using the system.
PUB
(English)
PUB (English): It seems that the
feedback style depends on the
tutor very much and the quality of
the feedback was only slightly
changed.
UNIMAN
(English)
UNIMAN (English): The feedback
encourages individual reflection
but feedback needs to provide
clearer identification of areas for
improvement
PVT4: Making the educational process
transparent
OVT4.1
Using PolyCAFe, tutors/facilitators monitor the
learner’s participation in online discussions
better: detect conversations with bad/good
collaboration, discover the coverage of concepts
of each participant and other differences
between learners.
PUB
(English)
UNIMAN
(English)
OVT4.2
The system provides learners with information
that helps them reflect better on their
performance as individuals and as group
members compared with traditional means.
PUB
(English)
UNIMAN
(English)
UNIMAN (English): Service
encouraged learners to reflect on
their performance but the quality
of feedback is inconsistent
OVT4.3
The visualization offers the users a better
understanding of chat conversations and
discussion forums
PUB
(English)
UNIMAN
(English)
UNIMAN (English): Learners
were more positive about the
visualisation than the facilitators.
PUB
(English)
PUB (English): Fully validated for
collaboration but not validated for
content
PUB
(English)
UNIMAN
(English)
PUB (English): The results are
not satisfying to validate
unconditionally as the average
scores were between 3.5-3.8/5.0
UNIMAN (English): Students felt
that the feedback did not provide
clear actions to remedy areas
identified as poorly performing
PVT5: Quality of educational output
OVT5.1
Learner performance in online discussions is
improved in the areas of content coverage and
collaboration when using PolyCAFe.
PVT6: Motivation for learning
OVT6.1
The direct feedback provided by the system
encourages learners to undertake further study
to address gaps in their coverage.
PVT7: Organisational efficiency
OVT7.1
There is a saving in institutional resources
overall.
PUB
(English)
PVT8: Relevance
OVT8.1
The service meets one or more institutional
objectives
PUB
(English)
UNIMAN
(English)
PVT9: Likelihood of adoption
OVT9.1
Users were motivated to continue to use the
system after the end of the formal validation
activities
PUB
(English)
UNIMAN
(English)
UNIMAN (English): Average
scores between 3.0-3.5/5.0
OVT9.2
A high score was obtained in the generic
questionnaires (based on UTAUT: likelihood of
adoption by users).
PUB
(English)
UNIMAN
(English)
PUB (English): Average score
3.75
UNIMAN (English): Average
score 3.46
OVT9.3
Tutors attending a dissemination workshop give
high scores to the question 'how likely are you to
adopt the service?'
PUB
(English)
PVT10: Additional WP-specific VTs, related
to Unique Selling Points
OVT10.1
Learners are more involved and motivated with
the course by using PolyCAFe.
PUB
(English)
PUB (English): the focus group
results are encouraging, but this
validation topic should be tested
more thoroughly/formally.
Exploitation (SWOT Analysis)
The objective you are asked to consider is: "PolyCAFe (v1.5) will be adopted in pedagogic contexts beyond the end of the project".
Strengths
The strengths of the system (v1.5) that would be positive indicators for adoption are:
PolyCAFe promotes learner reflection on their performance as individuals and as members of a group
Feedback from PolyCAFe has been shown to improve the collaborative skills of learners in online discussions
The institution requires less tutor time for feedback, support and grading.
PolyCAFe's monitoring of student participation makes the learning processes more transparent, e.g. by locating the
outliers and positioning individual learners in the peer group.
PolyCAFe contributes to improving the consistency of feedback between tutors, especially if PolyCAFe is used as a
starting point when providing the manual feedback to the learners
The feedback appears to have motivational aspects for engaging students into their activity.
Weaknesses
The weaknesses of the system (v1.5) that would be negative indicators for adoption are:
Usability problems and poor guidance when using the system cause discomfort to users
It is difficult to interpret the results, especially because of the large amount of information
General trust in the system and its reliability should be increased by improving accuracy
It is unclear whether students will accept that their grading goes beyond the submitted products, as PolyCAFe
enables the grading of the discussion processes during the production
PolyCAFe is best used for topics with small numbers of key words per concept
New sites have difficulty analyzing their situations in order to know which key words to enter into the system
Opportunities
The system has potential as follows:
It is the only software on the market that provides complex feedback for online conversations that focuses on
stimulating collaboration
While the use of CSCL (conversations) is becoming more and more popular to relieve the tutor burden, the PolyCAFe
software enables the tutors to monitor the individual contributions.
The system might be used as a feedback standard for training tutors to assess collaborative activities
There are many situations in which learners who participate in online discussions do not receive any feedback on their
output; the automatic feedback provided by PolyCAFe would therefore be very valuable.
PolyCAFe could also be used as a starting point for a chat agent that offers live feedback to students involved in
chats and forums in order to motivate them or help them engage in a better collaborative discourse.
The use of Web2.0 in education often implies that there are a lot of textual outputs that are very small portions of text
similar to a chat conversation. PolyCAFe could be easily adapted for this task.
Threats
PolyCAFe may not be suitable for all chats/discussion forum situations. It is more suited to chats/forums where (1)
grading and/or detailed feedback is required, and (2) where one of the aims of the chat/forum is for social learning to
take place.
There may be change management issues in introducing automatic feedback / assessment systems into new
environments. Introduction of the system has to overcome concerns about changes in working practices and whether
PolyCAFe's output can be trusted
Some learners may feel uncomfortable being monitored during their collaboration
It may be difficult to integrate PolyCAFe with the IT architecture of the institution.
Institutions may be deterred from adopting PolyCAFe owing to the extent of initial training in interpreting the data.
Privacy issues might arise in some institutions when using the system to analyze the contributions of students
Overall conclusion regarding the likelihood of adoption of PolyCAFe Version 1.5:
PolyCAFe is an innovative service, first on the market, for providing complex feedback for online chats and forums used in collaborative learning
situations. It saves tutors time when assessing conversations, but it is best suited to specific pedagogic settings in which tutors analyze the
discussions and the students expect to receive feedback.
However, it is important to enhance the functionality of the system by improving the usability and by providing better help and instructions on using and
interpreting the feedback. To be more convincing for users, PolyCAFe should be tested in more domains and in various contexts in order to
prove its reliability and usefulness.
Most important actions to promote adoption of PolyCAFe:
Improve the interface and the usability of the system
Provide better help and interpretation of the feedback
Transform the indicators provided as feedback into advice that guides the learners instead of assessing them
Extend PolyCAFe to various domains and another language where there is an interest in adopting the service
Use PolyCAFe's feedback live in a chat conversation in order to guide the learners' conversation
Improve the privacy of the system, by anonymization, user-alias management or by explicit sharing of the locus of control
Section 7 – Road map
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important future
enhancements to the system in order to meet stakeholder requirements:
Most important:
1. Improving the interface by supporting the users with an easy overview/interpretation of the vast amount of data generated.
2. Improving the help section and providing illustrative examples of relevant collaboration patterns.
3. The feedback should be expressed in a more direct manner (as advice), instead of just providing scores and indicators.
4. Integrate PolyCAFe into a chat environment in order to provide “live” feedback or to join the discussion process as an agent.
5. Extending PolyCAFe for Romanian in order to be used for more domains at PUB-NCIT (for courses conducted wholly in Romanian).
Other:
Improving the visualization of the discussion threads (and enhancing the threads detection algorithms).
Implement an anti-spam mechanism.
Improve the privacy by using a better mapping between users and aliases used for anonymization.
Improve the semantic similarity measure by considering alternatives like Latent Dirichlet Allocation and ontologies.
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important changes to
the current scenario(s) of use in order to meet stakeholder requirements:
Most important:
1. Focus on the importance of the collaboration aspect of the task to be solved.
2. Ask the learners to be as involved as possible in the collaborative task.
3. Try to develop a method for the automatic evaluation of the degree of reaching the learning outcomes, which are specified by the teacher when
setting up the assignment. This way the usage scenario is extended to cases where teachers need to assess the degree of reaching/fulfilling a
learning outcome for a chat or forum assignment (or for each participant in such a conversation).
4. Investigate using PolyCAFe for discussions that arise naturally in forums, chats or tweets of Communities of Practice or of learners who use
Web2.0 tools naturally (outside a formal education context).
5. As in certain situations it is difficult to determine the key concepts for the discussion beforehand, develop a scenario that uses PolyCAFe without
specifying them.
Other:
Automatic training of semantic spaces from online resources or books
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are possible additional educational
contexts for future deployment:
Most important:
1. Formal education where it is important to have open collaborative online debates between several participants.
2. Collaborative discussions of online Communities of Practice that use forums or tweets for communication (informal learning).
3. Language learning (foreign or native) for inexperienced learners that need to learn specific concepts by debates and examples.
4. Argumentation-based learning situations that use online discussions.
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important issues for
future technical research to enable deployment of language technologies in educational contexts:
Most important:
1. Because the pilot participants (especially the tutors) expressed concerns and alternative viewpoints on the meaning of good collaboration, it is
important to develop several alternative algorithms for assessing the degree of collaboration in a discussion and compare their results against a gold
standard in order to determine the best one.
2. Because the current implementation of PolyCAFe discovers a great number of implicit links, several elongated discussion threads have been
observed. Therefore, there is a need to improve the discussion threads partitioning algorithms in order to split these threads.
3. As various alternatives exist for measuring semantic similarities, it would be useful to integrate alternative methods for computing the semantic
similarities between utterances, considering supplements to LSA such as pLSA, LDA and ontologies.
4. Further extend the implicit links detection mechanism by studying newer algorithms for coreference resolution and RST parsers. These
algorithms should be especially designed for conversations with at least two participants, because algorithms that work well on written texts
(and are tested on the standard MUC and DUC corpora) rely on a different discourse model and do not perform very well on
conversations.
5. Develop an algorithm to measure the coverage of a learning outcome within a discussion by comparing different machine learning and
evaluation techniques in order to be able to use PolyCAFe in this new learning scenario.
Roadmap - validation activities
Further validation planned for beyond the end of the project:
Claim (OVT): OVT10.1. Learners are more involved and motivated with the course by using PolyCAFe.
Methodology: Questionnaires, pre and post tests, comparison with control group, extended activity.
Objective (OVT): Learners are knowledgeable in the domain of the course by using PolyCAFe.
Methodology: Questionnaires, pre and post tests, comparison with control group, extended activity.
Objective (OVT): Test the accuracy of the feedback for more courses and topics in order to assess and improve the reliability of the service.
Methodology: Measurements
Appendix B.5 Validation Reporting Template for WP5.2 (UPMF / CNED)
Section 1: Functionality implemented in Version 1.5 and alpha / beta-testing record
Brief description of functionality
Version
number of unit
Changes from Version 1.0
Code structure
1.5
In order to make maintenance easier, the structure of the code has been
cleaned up and reorganized. Additionally, these changes had to be made in
order for widgetization to be handled more smoothly.
R-LSA integration
1.5
The previous LSA implementation requires a deprecated OS, which is a strong constraint for
system managers who will not dedicate one server to the system. Integrating R-LSA
also allowed us to improve the feedback algorithm and the response
time (measurements have shown an improvement of between 60 and 80%,
see D5.3)
Learner judgment on feedback
1.5
In order to make Pensum's role clearer to the learners (to let them reflect on
their writing so as to understand the documents better, not to fix the
shortcomings of their synthesis for them), and to acknowledge that semantic
analysis cannot be 100% accurate, the learners can question the feedback
provided and even justify their objection. If they use the functionality, it gives tutors
insight into the student's intent and can trigger valuable exchanges.
Severity management
1.5
This functionality allows the learner to handle the quantity of feedback provided
by the system, which can display only the most important issues or many of
them. Depending on what they are doing (first draft/last proof-reading), the
functionality can be an asset to the learner's organization of their work.
Version handling
1.5
All the successive versions of the synthesis are stored along with feedback and
learner actions on feedback. The learner will benefit from it, in that any action
taken upon feedback will be propagated to later versions. Teachers and
researchers alike use it as a data source on the learner's writing
process and response to feedback by the system.
Alpha-testing
Pilot site and language: UPMF – French
Date of completion of alpha testing: End of September, 2010
Who performed the alpha testing? Philippe Dessus & Mathieu Loiseau
Beta-testing
Pilot site and language: UPMF – French
Has stakeholder validation taken place using the service embedded in Elgg (Yes/No/Partially):
No.
If 'No' or 'Partially', give reasons:
The Pensum Widget (v 1.6 and up) was being developed while the experimentation took place.
Widgetization required the extension of the in-depth restructuring of the code. It also required the
development of various new features (multiple vector space handling, text addition, etc.), which were not
relevant to the testing at hand. We were also in the process of reworking some existing services (such
as sentence detection). We thought it best to have students work on the latest stable version of the
system.
Beta-testing performed by:
17 participants (learners, n = 9; tutors, n = 8).
beta testing environment (stand-alone service / integrated into Elgg):
Stand-alone service
HANDOVER DATE: 12 October 2010
(Date of handover of software v.1.5 for validation)
Section 2: Validation Pilot Overview
Note: The underlying goal of this pilot study is to test Pensum in real-world settings, with learners attending an existing ICT course at a
distance. The ethical requirements of French universities for experiments involving humans require that participants be informed of the goals of the study and
that their participation be fully voluntary, so they can leave the experiment at any moment.
This section only mentions one pilot site, but we were in contact with another pilot site (CNED-University of Rouen, France), involving 51 bachelor
students in Educational Sciences attending the same e-learning course as CNED-Lyon. The CNED-Rouen students were contacted to
participate as volunteers in this pilot study and only three of them completed the whole study. We thus decided to integrate them into the analysis of
the first pilot site, as they were given the same task and attended exactly the same courses in the same conditions. We performed a first analysis
of the reasons why we had so few participants from CNED-Rouen involved in this study: one of us went to Lyon during the first students' meeting
at the beginning of November in order to present the experiment and the task, whereas the CNED-Rouen students were contacted by e-mail only,
yielding a weaker involvement in the task. This point is taken into account in the “Most important actions” rubric (section 6 of this document).
NB Information about pilot sites, courses and participants has been transferred to Appendix A.3
Pilot task
Pilot site:
CNED-Lyon-Rouen
Pilot language: French
What is the pilot task for learners and how do they interact with the system?
51 participants preparing their Bachelor degree in Educational Sciences (CNED) were recruited to write two syntheses, but only those who
answered the questionnaire were considered. Four tutors (Master degree in Educational Sciences) were also recruited to manage learners.
Learners had to use Pensum at a distance in order to write two syntheses of the first two parts of their ICT course. No length
constraints were given, and the students were given 10 days to write both syntheses.
They were asked to use Pensum to write a first synthesis and were randomly divided into two groups (first synthesis: experimental group, n
= 17; control group, n = 22). The experimental group was directed to Pensum whereas the control group was directed to a fake Pensum interface,
without feedback and with very basic text formatting possibilities, in order to prevent them from using the advanced functionalities of a word
processor. After writing the first synthesis, the control group was asked to use Pensum and the experimental group to use the fake Pensum, to
obtain a counterbalanced design (second synthesis: experimental group, n = 16; control group, n = 11). Experimental results were analyzed on the
merged first and second experimental groups. Control group results were analyzed only for the first control group, whose participants had
not previously used Pensum, because order effects were detected.
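The counterbalanced assignment described above can be sketched as follows. This is a minimal illustration only: the function name, the participant identifiers, the even split and the fixed seed are all assumptions, and the actual pilot groups were unequal in size because only learners who answered the questionnaire were retained.

```python
import random

def counterbalanced_assignment(participants, seed=0):
    """Randomly split participants into two groups and return, for each
    participant, the interface used for the first and second synthesis."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    schedule = {}
    # Group A: Pensum first, then the fake interface without feedback.
    for p in shuffled[:half]:
        schedule[p] = ("pensum", "fake_pensum")
    # Group B: fake interface first, then Pensum.
    for p in shuffled[half:]:
        schedule[p] = ("fake_pensum", "pensum")
    return schedule

# Hypothetical learner identifiers, for illustration only.
learners = [f"learner_{i:02d}" for i in range(1, 11)]
schedule = counterbalanced_assignment(learners)
```

Each learner meets both conditions exactly once, so the order in which they are met can be examined separately; this is what made it possible to detect the order effects mentioned above.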
After having written each synthesis, learners were asked to fill out the LTfLL Likert questionnaire. Finally, 12 learners agreed to participate
in an interview by phone.
What do the learners produce as outputs? Are the outputs marked?
All participants produced two syntheses corresponding to the first two exercises in their ICT lessons in a counterbalanced design (first summary
with Pensum, the second without it or vice versa). The first exercise was to summarize one document, whereas the second was to summarize
two documents.
The learners' answers to the questionnaire were cross-referenced with the traces of their activity on Pensum, in order to see whether the
extent of their use of the system influenced their judgement.
Finally, outputs were marked by expert teachers.
How long does the pilot task last, from the learners starting the task to their final involvement with the software?
10 days
How do tutors/student facilitators interact with the learners and the system?
Learners can send questions by e-mail to tutors. For the second synthesis, tutors can also use the notepad to share their comments with the
students.
Describe any manual intervention of the LTfLL team in the pilot:
None.
Experiments
Name of experiment: Expert evaluation
Objective(s): Assess the quality of Pensum's feedback compared to the experts' feedback
Assess the output quality
Measure the time saved when correcting syntheses written with Pensum vs. word processor-based syntheses (written without feedback)
Details:
Participants: Seven teachers specialized in synthesis writing (associate professors and postdoctoral students in Linguistics) were recruited (6
females, 1 male) as experts to mark a random set of the written syntheses (with and without Pensum). One expert was removed because the
task was not performed, and another evaluated only half of the assigned syntheses.
Task: Four different syntheses were assigned to each expert: two written with feedback (Pensum) and two written without feedback as a control
(the fake Pensum interface). Only one synthesis was assigned to all experts, in order to compute an inter-rater agreement score. They were asked to
mark each synthesis on three levels: concepts present in the learners' syntheses that are not present in the source text, gaps in coherence,
and concepts missing from the learners' syntheses that are present in the source text.
Analysis: We then calculated Pensum's Recall and Precision as percentages. Precision corresponds to the relevance of the information
given by the system, i.e. the proportion of Pensum's feedback that was also given by the experts. Recall corresponds to the proportion of the
information identified by the experts that Pensum's feedback also identifies.
The overall Recall and Precision rates take into account all experts and syntheses, but it is also possible to analyse Recall and
Precision synthesis by synthesis to evaluate the match between Pensum's feedback and the experts' feedback. The experts were also asked to
grade the syntheses, in order to try to evaluate the potential gain in quality of student outputs provided by Pensum.
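The Precision and Recall computation described in the analysis can be sketched as below. This is an illustration under stated assumptions, not the procedure actually used in the pilot: it assumes that each piece of feedback can be reduced to a comparable identifier (here hypothetical sentence IDs), so that "common" feedback is simply the intersection of the two sets.

```python
def precision_recall(system_items, expert_items):
    """Return (precision, recall) of the system's feedback against the
    expert's feedback, both given as collections of comparable identifiers."""
    system, expert = set(system_items), set(expert_items)
    common = system & expert                      # feedback given by both
    precision = len(common) / len(system) if system else 0.0
    recall = len(common) / len(expert) if expert else 0.0
    return precision, recall

# Hypothetical data: identifiers of the sentences flagged for one synthesis.
pensum_feedback = {"s2", "s5", "s7", "s9"}
expert_feedback = {"s1", "s2", "s3", "s5", "s8"}
precision, recall = precision_recall(pensum_feedback, expert_feedback)
```

With these made-up sets, two of the four items flagged by the system are confirmed by the expert (precision 0.5), and two of the five expert items are found by the system (recall 0.4).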
Name of experiment: Dissemination Workshop
Objective(s): Promote Pensum to a larger number of likely tutors
Evaluate Pensum‟s acceptance within this group
Collect tutors‟ viewpoints and suggestions about Pensum
Details:
Participants
27 participants (including 6 teachers) took part in our workshop.
Workshop
The workshop took place during a French congress dedicated to Internet innovations and notably to web 2.0-based education. We were
invited to present the LTfLL project and more precisely Pensum. First, we introduced the LTfLL project and Pensum to participants (10 min),
followed by a demonstration of Pensum’s widget in writing a synthesis (10 min). Second, we conducted a debate based on the LTfLL focus group
template (10 min). Finally the dissemination workshop questionnaire was distributed (filled out and collected after the workshop). However,
only 6 tutors (out of the 27 who attended the workshop) agreed to answer the dissemination questionnaire.
Section 3: Results - Validation/verification of Validation Topics
OVT 1.1
Pilot site
CNED-Lyon-Rouen
Pilot language
French
Operational Validation Topic
According to tutors, in a high proportion of cases, the feedback presented by the system correctly
identifies the concepts missing in the learners' syntheses that are present in the source texts
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Calculation of Precision and Recall of the feedback (expert experiment)
Results: 22 syntheses were evaluated by the experts (4 syntheses each for 5 of the experts and 2 syntheses for the last expert).
The number of feedback items given by Pensum and the number given by the experts were counted. Common feedback is feedback given both by Pensum and by
the experts.
Feedback | Pensum | Expert | Common
Mean feedback by synthesis | 5.6 | 15.4 | 1.9
On average, Pensum identifies fewer missing concepts than the experts do, and those it detects sometimes differ from those the experts detect.
Thus the overall Recall is 12% and the overall Precision is 34%.
On average, tutors' (n = 4) opinion on the question of missing concepts identification (Q. 34: “the feedback presented by the system correctly
identifies concepts missing”) is negative (M = 1.5; SD = 0.58).
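The questionnaire figures reported in this section (mean out of 5, standard deviation, percentage of "agree / strongly agree") can be derived from raw Likert responses as sketched below. The report does not give the raw responses or state which standard deviation formula was used, so the response list, the use of the sample standard deviation (n - 1) and the threshold of 4 for "agree" are all assumptions.

```python
from statistics import mean, stdev

def summarize_likert(responses, agree_threshold=4):
    """Return (mean, sample SD, % agree or strongly agree) for a list of
    1-5 Likert responses, rounded for reporting."""
    m = mean(responses)
    sd = stdev(responses) if len(responses) > 1 else 0.0
    agree = sum(r >= agree_threshold for r in responses)
    pct_agree = 100 * agree / len(responses)
    return round(m, 2), round(sd, 2), round(pct_agree)

# Hypothetical responses from four tutors (not the actual pilot data).
summary = summarize_likert([1, 1, 2, 2])
```

This hypothetical response set happens to give M = 1.5 and SD = 0.58, the values reported for Q. 34, which is at least consistent with a sample standard deviation having been used; the actual responses may of course have been different.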
OVT 1.2
Pilot site
CNED-Lyon-Rouen
Pilot language
French
Operational Validation Topic
According to tutors, in a high proportion of cases, the feedback correctly identifies concepts present in the
learners' syntheses, which are not present in the source texts.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Tutors (4) questionnaire and calculation of the Precision and Recall of the feedback
Results:
Feedback | Pensum | Expert | Common
Mean feedback by synthesis | 1.6 | 1.1 | 0.4
On average, Pensum identifies more off-topic concepts (present in the syntheses but not in the source texts) than the experts do.
The overall Recall is 39% and the overall Precision is 26%.
On average, tutors' (n = 4) opinion on the question of off-topic identification (Q. 35: “the feedback correctly identifies concept present in the
syntheses but not in the source texts”) is negative (M = 2.3; SD = 1.26).
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“Off-topic detection yields relative performance, it depends on the instructions and the expected length of the text.”
OVT 1.3
Pilot site
CNED-Lyon-Rouen
Pilot language
French
Operational Validation Topic
According to tutors, in a high proportion of cases, the feedback correctly identifies gaps in the coherence of
the learners' syntheses.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Tutors (4) questionnaire and calculation of the Precision and Recall of the feedback
Results: See OVT 3.12
Feedback | Pensum | Expert | Common
Mean feedback by synthesis | 20.9 | 1.8 | 1.5
On average, Pensum identifies many more gaps in coherence than the experts do.
The overall Recall is 85% and the overall Precision is 7%.
On average, tutors' (n = 4) opinion on the question of the identification of gaps in coherence (Q. 36: “the feedback correctly identifies gaps in the
coherence”) is negative (M = 2.3; SD = 1.26).
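As a rough consistency check, the rates reported for OVT 1.1 to 1.3 can be compared with the ratios of the mean feedback counts given in the three feedback tables. The sketch below assumes that Precision is Common / Pensum and Recall is Common / Expert; the dictionary keys are descriptive labels chosen here, and only the numbers come from the tables.

```python
# (Pensum, Expert, Common) mean feedback per synthesis, as reported.
reported_means = {
    "OVT 1.1 missing concepts": (5.6, 15.4, 1.9),
    "OVT 1.2 off-topic concepts": (1.6, 1.1, 0.4),
    "OVT 1.3 coherence gaps": (20.9, 1.8, 1.5),
}

def rates_from_means(pensum, expert, common):
    """Approximate (precision, recall) from per-synthesis mean counts."""
    return common / pensum, common / expert

approx = {name: rates_from_means(*counts)
          for name, counts in reported_means.items()}

for name, (precision, recall) in approx.items():
    print(f"{name}: precision ~ {precision:.0%}, recall ~ {recall:.0%}")
```

The ratios come out close to the reported overall rates but not always identical (for example roughly 36% against the reported 39% Recall for OVT 1.2). Small differences of this kind would be expected if the reported rates were computed from the raw counts rather than from means rounded to one decimal, although the report does not say so.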
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“A lot of gaps in coherence”
Tutors
“There are many gaps in coherence identified by Pensum but often unjustified”
Tutors
“The low reliability of coherence gaps feedback calls into question its inclusion in the software”
OVT 2.1
Pilot site
CNED-Lyon-Rouen
Pilot language
French
Operational Validation Topic
The tutor spends less time preparing feedback compared to traditional means.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Tutors (4) questionnaire, communication between learners and tutors, and time spent by experts to mark the
syntheses.
Results: Number of e-mails received and sent by tutors per learner, for the first synthesis task.
Group | Nb participants | Total e-mails received by tutors | E-mails received per learner | Total e-mails sent by tutors | E-mails sent per learner
Experimental | 33 | 41 | 1.2 | 39 | 1.2
Control | 22 | 47 | 2.1 | 46 | 2.1
Since the tutors of the experimental group received and sent fewer e-mails per learner than those of the control group, we can infer that the experimental group's
tutors spent less time preparing feedback compared to traditional means.
Only one tutor interacted with their learners through the notepad.
Group | N experts | N syntheses | Mean time for correction in min (SD)
Experimental | 6 | 11 | 20.1 (6.1)
Control | 6 | 11 | 22.9 (11.2)
On average, each expert spent 20.1 min assessing syntheses pre-assessed by Pensum (n = 7 experts; n = 2 syntheses per expert), whereas each
expert spent 22.9 min assessing control syntheses (n = 7 experts; n = 2 syntheses per expert). Note: 1 expert was removed (work not done) and
1 expert corrected only one synthesis of each type.
Moreover, there is a moderate negative correlation (r = –.63; p = .05) between the time spent marking the syntheses and the number of feedback requests. This
suggests that the more feedback learners request, the less time is spent marking their syntheses.
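A correlation of this kind can be computed as sketched below. The report does not state which coefficient was used, so Pearson's r is an assumption here, and the two data series are invented purely for illustration; they are not the pilot data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical values: feedback requests per synthesis and marking time (min).
feedback_requests = [2, 5, 8, 12, 15]
marking_minutes = [30, 26, 24, 19, 18]
r = pearson_r(feedback_requests, marking_minutes)
```

A negative r means that syntheses with more feedback requests tended to take less time to mark. With samples as small as those in this pilot, such a coefficient indicates an association only and does not establish that requesting feedback causes the shorter marking time.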
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean (/5) | Standard deviation | % Agree / Strongly agree | n
Tutors | 7. It takes less time to complete my teaching tasks using Pensum than without the system. | Experimental | 2.8 | 1.26 | 25% | 4
Tutors | 8. Using Pensum enables me to work more quickly than without the system. | Experimental | 2.5 | 1.29 | 25% | 4
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“The assessment with Pensum's interface is longer because there are not enough tools like arrows, colours or other visual
cues to insert in the synthesis”.
Tutors
“For long lessons or with various documents, Pensum can help us save time.”
Tutors
“But it facilitates education especially for students who have not synthesized texts long ago”
OVT 2.2
Pilot site
CNED-Lyon-Rouen
Pilot language
French
Operational Validation Topic
It is easier (there is less cognitive load) for tutors to provide feedback using Pensum compared with just reading
learner texts
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Tutors (4) questionnaire
Results: Only questionnaire-related results.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean (/5) | Standard deviation | % Agree / Strongly agree | n
Tutors | 11a. Rate mental effort (reversed scale) | Experimental | 3.5 | 1.0 | 75% | 4
Tutors | 11b. System requires less mental effort | Experimental | 3.0 | 1.83 | 50% | 4
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“Improvements must be made in the communication between the tutor and the learner”
Tutors
The interface is “not satisfactory”,
Tutors
It needs “Clues about learners' certainty” on the relevance of their synthesis
Tutors
It needs “A help button”
Tutors
It needs “A presentation of tutor's remarks in parallel to synthesis displays”
OVT3.1
Pilot site
CNED-Lyon-Rouen
Pilot language
French
Operational Validation Topic
The teacher's activity shifts towards providing more advanced feedback.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Tutors (4) questionnaire
Results: Only questionnaire-related results
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean (/5) | Standard deviation | % Agree / Strongly agree | n
Tutors | 44. To what extent do you think your job as a teacher/tutor has been transformed by the fact that when students used Pensum, it gave them pieces of feedback on their work. | Experimental | 3.0 | 0.82 | 25% | 4
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“If the learner is not motivated in his task, Pensum is as useful as paper-pencil.”
Tutors workshop
“Automatic systems are needed to unburden teachers of repetitive tasks but it should not replace the teacher ”
OVT3.2
Pilot site
CNED-Lyon-Rouen
Pilot language
French
Operational Validation Topic
The feedback given is more consistent than that of different tutors (there is more homogeneity among the
responses provided to learners).
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Learners (33) questionnaire
Results: Only questionnaire-related results
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean (/5) | Standard deviation | % Agree / Strongly agree | n
Learners | 31. I perceive the feedback as more consistent than that of tutors | Experimental* | 2.1 | 0.8 | 0% | 33
OVT 3.3
Pilot site
CNED-Lyon-Rouen
Pilot language
French
Operational Validation Topic
Learners find the feedback given by the system is mostly correct.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Learners (33) questionnaire
Results: Only questionnaire-related results
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean (/5) | Standard deviation | % Agree / Strongly agree | n
Learners | 32. I find the feedback given by the system is correct in more than 75% of the cases | Experimental* | 2.8 | 1.06 | 27% | 33
*question for experimental group only
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“We don't know why what we do is wrong, there is no comment.”
Learners
The feedback is “sometimes right, sometimes wrong”.
Learners
The feedback “creates misunderstandings”.
Learners
Feedback is “not clear enough and not quite readable”.
OVT 3.4
Pilot site: CNED-Lyon/Rouen
Pilot language: French
Operational Validation Topic: Learners find the feedback given by the system is relevant to (i.e. useful to them in) the task in hand.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Learners (33) questionnaire and trace analysis
Results: Questionnaire and correlations between answers and trace-related data.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean (/5) | Standard deviation | % Agree / Strongly agree | n
Learners | 33. I find the feedback given by the system is relevant to (i.e. useful to them in) the task in hand. | Experimental* | 2.9 | 1.01 | 27% | 33
*question for experimental group only
The correlation between Q. 33 and the number of feedback requests is r = 0.42, p < .05, showing a moderate but significant relation between learners' opinion of the feedback's relevance and their actual use of the feedback.
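The reported significance level can be checked from r and the sample size alone. The sketch below assumes that all 33 learners entered the correlation and that a two-tailed test of Pearson's r against zero was used; neither is stated explicitly in the report. The critical value is the standard tabulated Student's t for 31 degrees of freedom, and the names `t_from_r` and `T_CRIT_DF31_05` are illustrative.

```python
import math


def t_from_r(r, n):
    """t statistic for testing H0: rho = 0, given Pearson's r and sample size n."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Values reported above: r = 0.42, with n = 33 learners (assumed sample size).
t = t_from_r(0.42, 33)

# Two-tailed critical value of Student's t for df = 31 at alpha = .05,
# taken from standard tables (approximately 2.04).
T_CRIT_DF31_05 = 2.04

print(round(t, 2), abs(t) > T_CRIT_DF31_05)
```

Under these assumptions the statistic comes out at about 2.58, above the critical value, which agrees with the reported p < .05.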
Formative results with respect to validation indicator
Stakeholder type | Results
Learners | Pensum helps to “read more precisely the course” and to “memorize more things and (...) retain the key-passages”
Learners | “The feedback is not fine enough”, “not clear enough”, and “not explicit enough”.
OVT 3.5
Pilot site: CNED-Lyon/Rouen
Pilot language: French
Operational Validation Topic: Learners trust the feedback provided by Pensum.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Learners interview
Results: Formative results only.
Formative results with respect to validation indicator
Stakeholder type | Results
Learners | “To me, feedback can identify the mistakes and they are often justified, it was a help and it is obvious that feedback can't be perfect”
Learners | “I often disagree with the feedback; my synthesis seemed OK to me but not to Pensum.”
Learners | “I had no confidence in the feedback”
OVT 3.6
Pilot site: CNED-Lyon/Rouen
Pilot language: French
Operational Validation Topic: The tutors find the feedback given by the system at the right level considering the task in hand in a high proportion of cases.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Tutors (4) questionnaire
Results: Questionnaire-based results only.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean (/5) | Standard deviation | % Agree / Strongly agree | n
Tutors | 41. The gap in coherence feedback helps learners write readable syntheses | Experimental | 3.3 | 0.96 | 50% | 4
Tutors | 42. The off-topic feedback helps learners write syntheses with no or few off-topic elements | Experimental | 3.8 | 0.5 | 75% | 4
Tutors | 43. The concept missing feedback helps learners write syntheses with only important concepts present in the source text | Experimental | 3.0 | 1.15 | 50% | 4
OVT 3.7
Pilot site: CNED-Lyon/Rouen
Pilot language: French
Operational Validation Topic: Tutors perceive that the feedback from Pensum provides a reliable source of information about learners' conceptual coverage.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Tutors (4) questionnaire
Results: Questionnaire-related results only.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean (/5) | Standard deviation | % Agree / Strongly agree | n
Tutors | 45. The gap in coherence feedback gave me a reliable source of information about learners' conceptual coverage. | Experimental | 2.8 | 1.71 | 25% | 4
Tutors | 46. The off-topic feedback gave me a reliable source of information about learners' conceptual coverage. | Experimental | 2.8 | 0.96 | 25% | 4
Tutors | 47. The concept missing feedback gave me a reliable source of information about learners' conceptual coverage. | Experimental | 3.3 | 0.96 | 50% | 4
Formative results with respect to validation indicator
Stakeholder type | Results
Tutors | “The feedback is not explicit enough on what Pensum expects from learners”.
Tutors | “Repetitions induced by Pensum's feedback and by the necessary changes to correct them provide a better learning and highlight learner's essential ideas.”
OVT 4.1
Pilot site: CNED-Lyon/Rouen
Pilot language: French
Operational Validation Topic: Learners can receive feedback whenever they want
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Learners (33) questionnaire
Results: Questionnaire-related results only.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean (/5) | Standard deviation | % Agree / Strongly agree | n
Learners | 10. Pensum provides me with the requested information when I require it (i.e. at the right time in my work activities). | Experimental* | 3.2 | 0.88 | 30% | 33
*question for experimental group only
Formative results with respect to validation indicator
Stakeholder type | Results
Learners | “The feedback should be automatic.”
OVT 5.1
Pilot site: CNED-Lyon/Rouen
Pilot language: French
Operational Validation Topic: The textual output that is handed over to the teacher is of better quality.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Comparison of experts' marks and trace analysis
Results:
Group | N experts | N syntheses | Mean scores /20 (SD)
Experimental | 7 | 13 | 12.33 (3.6)
Control | 7 | 13 | 12.25 (3.4)
The syntheses of the experimental (Pensum) group received a mean mark of 12.33/20, against 12.25/20 for the control group. The difference is not statistically significant, F(1,26) = 0.0002, ns.
Additionally, the syntheses of learners who requested feedback are not significantly better than those of learners who did not (r = 0.25, ns).
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean (/5) | Standard deviation | % Agree / Strongly agree | n
Tutors | 40. Better output with the system | Experimental | 2.5 | 1.73 | 25% | 4
Formative results with respect to validation indicator
Stakeholder type | Results
Tutors | “It needs a human behind; otherwise Pensum is similar to a word processor.”
OVT 6.1
Pilot site: CNED-Lyon/Rouen
Pilot language: French
Operational Validation Topic: The direct feedback provided by the system encourages learners to undertake further study to address gaps in their coverage.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Learners (experimental group, n = 33; control group, n = 22) questionnaire and trace analysis
Results: Questionnaire and correlations with trace-related data.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean (/5) | Standard deviation | % Agree / Strongly agree | n
Learners | 18. Using Pensum increases my curiosity about the learning topic. | Experimental | 3.1 | 1.18 | 45% | 33
Learners | 18. Using Pensum increases my curiosity about the learning topic. | Control | 3.3 | 1.21 | 36% | 22
Learners | 19. Pensum makes learning more interesting. | Experimental | 2.9 | 0.95 | 27% | 33
Learners | 19. Pensum makes learning more interesting. | Control | 3.0 | 1.25 | 31% | 22
Learners | 20. Using Pensum motivates me to explore the learning topic more fully. | Experimental | 3.1 | 1.17 | 36% | 33
Learners | 20. Using Pensum motivates me to explore the learning topic more fully. | Control | 3.2 | 1.41 | 41% | 22
The correlations between the number of distinct feedback items provided by Pensum over the course of the learner's activity and Q. 18 (Pensum increases my curiosity), Q. 19 (Pensum makes learning more interesting), and Q. 20 (Pensum motivates me to explore the learning topic more fully) are, respectively, r = 0.44, p < .01; r = 0.40, p < .05; and r = 0.38, p < .05. These correlations show a moderate but significant relation between the motivation to use Pensum and the actual use of the feedback.
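For readers who want to see how such a coefficient is obtained from trace data and questionnaire answers, Pearson's r can be computed as sketched below. The report does not name the software or the exact procedure used, so this is only an illustration; the function name `pearson_r` and the paired values are hypothetical and are not the pilot data.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 items)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: feedback items received per learner (from traces)
# and the same learners' answers to one Likert question.
feedback_items = [0, 2, 3, 5, 8, 1]
likert_answers = [2, 3, 3, 4, 5, 2]
print(round(pearson_r(feedback_items, likert_answers), 2))
```

A positive coefficient only indicates that the two measures vary together; it does not show that receiving more feedback causes higher motivation, or the reverse.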
Formative results with respect to validation indicator
Stakeholder type | Results
Learners | “I did not want to explore more the course because I think I've a look at this method.”
OVT 7.1
Pilot site: CNED-Lyon/Rouen
Pilot language: French
Operational Validation Topic: There is a saving in institutional resources overall
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Managers (n = 3) interview, communication between learners and tutors, and time spent by experts to mark the
synthesis.
Results: Questionnaire, communication and experts results.
As highlighted above, Pensum helps clarify the communication between learners and tutors. Additionally, the use of Pensum saves time spent marking the syntheses (see OVT 2.1).
Manager #1 indicated that the main quality of Pensum is not so much to help save time or money as to make the institution more aware of the competences of the learners and of their own understanding of the course material.
Manager #2 indicated that the three kinds of feedback offered to learners allow them to learn to write a summary or synthesis by identifying highly relevant elements, writing being a core competence. In addition, he pointed out the time saved in marking syntheses and in planning a course. Finally, Pensum would enable learners to become more involved in their learning. However, in its current state, improvements are needed, especially regarding the quality of the text produced. The three types of feedback are not sufficient: very important information is not taken into account, which linguistic feedback on spelling, grammar, syntax or structure could cover.
Manager #3 indicated that Pensum can save resources if used with tutors or teachers involved in guiding learners within the learning environment.
OVT 8.1
Pilot site: CNED-Lyon/Rouen
Pilot language: French
Operational Validation Topic: The service meets one or more institutional objectives.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Managers (3) interview
Results: Questionnaire-only results
Manager #1 indicated that Pensum would enable the institution to solve two main problems with regard to the assessment of learners' competences. First, with Pensum, learners would be assessed on their positioning with regard to their understanding of a course, as well as with regard to a competence framework. Second, as the institution has a lot of “minutes” documents from meetings, Pensum could enable learners to check their understanding, both for people who attended the meeting and for those who did not.
Manager #2 indicated that one of the goals of his institution is the integration of ICT in education; the simplicity of the software and of its use is a decisive advantage for its integration in his institution.
Manager #3 indicated that Pensum can address three pedagogy-focused goals: improving the quality of teachers' productions (as authors), the quality of learners' understanding of course contents, and the pass rate in exams (notably transmissive, literature-based ones).
OVT 9.1
Pilot site: CNED-Lyon/Rouen
Pilot language: French
Operational Validation Topic: Users were motivated to continue to use the system after the end of the formal validation activities
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Learners (experimental group, n = 33; control group, n = 22), tutors (4) questionnaires, and correlation with trace
analysis
Results: Questionnaire-based results and trace data analysis
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean (/5) | Standard deviation | % Agree / Strongly agree | n
Learners | 29. I would like to use the service after the pilot. | Experimental | 3.3 | 1.23 | 45% | 33
Learners | 29. I would like to use the service after the pilot. | Control | 3.9 | 1.31 | 68% | 22
Tutors | 30. I would like to use the service after the pilot. | Experimental | 3.0 | 1.41 | 50% | 4
Learners | 30. If the service is available after the pilot, I will definitely use it | Experimental | 3.2 | 1.29 | 42% | 33
Learners | 30. If the service is available after the pilot, I will definitely use it | Control | 3.6 | 1.26 | 55% | 22
Tutors | 31. If the service is available after the pilot, I will definitely use it | Experimental | 1.8 | 0.96 | 0% | 4
Moreover, the correlation between Q. 30 (learner experimental group) and the number of feedback requests is r = 0.412, p < .05, showing a moderate but significant relation between the learners' likelihood of using Pensum in the future and their actual use of the feedback.
Finally, after the pilot, of the 51 users with a login and password for Pensum, only two CNED students used it to synthesise another part of their ICTE course that was not included in our experiment.
Formative results with respect to validation indicator
Stakeholder type | Results
Learners | “I will be motivated to use Pensum again, if improvements are made.”
OVT 9.2
Pilot site: CNED-Lyon/Rouen
Pilot language: French
Operational Validation Topic: A high score was obtained in the generic questionnaires (based on UTAUT: likelihood of adoption).
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Learners (experimental group, n = 33) questionnaire
Results: Questionnaire-based results
Descriptive Statistics - Learners
Dimension | N | Mean | Std. Deviation
Effectiveness | 33 | 2.91 | 0.833
Efficiency | 33 | 2.85 | 0.643
Cognitive load | 33 | 2.52 | 1.149
Usability | 33 | 3.76 | 0.812
Satisfaction | 33 | 3.03 | 0.909
Facilitating conditions | 33 | 3.73 | 0.911
Self-Efficacy | 33 | 3.43 | 0.860
Behavioural intention | 33 | 3.18 | 1.249
CNED | 33 | 3.21 | 0.656
Valid N (listwise) | 33 | |
OVT 9.3
Pilot site: CNED-Lyon/Rouen
Pilot language: French
Operational Validation Topic: Tutors attending a dissemination workshop give high scores to the question 'how likely are you to consider adopting the service in your own educational practice?'
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Tutors dissemination (6) questionnaire
Results: Questionnaire-related results only.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean (/5) | Standard deviation | % Agree / Strongly agree | n
Tutors workshop | How likely are you to consider adopting the service in your own educational practice? | Experimental | 2.3 | 1.36 | 33% | 6
Section 4: Results – validation activities informing future changes / enhancements to the system
VALIDATION ACTIVITY
Pilot partner: CNED Lyon & Rouen
Service language: French
Additional formative results (not associated with validation topics)
Alpha testing
Beta testing
Minor interface changes (icons, size of fields, etc)
Learners and tutors suggestions from Verification study (most frequent suggestions from the poll):
Interface changes:
Possibility to change font, highlighting, size of text…
Possibility to view the feedback from within the writing mode
Propose more extensive explanations on each feedback and solutions to improve the synthesis
From focus group
Make the feedback prompts viewable from within the synthesis writing field.
Propose more extensive explanations on each piece of feedback and efficient solutions to improve the synthesis.
The enhancement of the interface for writing and reading syntheses (font, highlighting, text size).
Tutor interviews
General suggestions:
Enhance the layout (line breaks, paragraphs, ...) and don't let Pensum remove the existing one.
Highlight the feedback
Additional functionalities
Highlight and bookmarking function, spelling, synonym, syntax link
More feedback, keywords, tooltips
A warning button that warns tutors a learner needs help
Clues about learners' certainty on the relevance of their synthesis
Tutor workshop(s)
A dictionary of synonyms and a thesaurus
Pensum could take into account the meaning of logical connectors.
Learners interview (prioritisation of enhancements)
Learners judged that the five most important areas for enhancement of the system are:
To improve the feedback (33%)
To enhance the layout of Pensum (25%)
To improve the account of the syntax of syntheses (8.3%)
To allow the highlighting of the course/synthesis (8.3%)
The ability to link several sentences to each other (8.3%)
To improve the compatibility with the web browser (8.3%)
To improve the interactivity with the tutor (8.3%)
Teaching manager interview
Comprehensive help has to be provided to users.
Integration of linguistic feedback: spelling, grammar, syntax or structure
Section 5: Results – validation activities informing transferability, exploitation and barriers to adoption
VALIDATION ACTIVITY
Partner(s) involved: CNED Lyon & Rouen
Service language: French
Additional formative results (not associated with validation topics)
Alpha testing
Beta testing
Tutor interviews
Major issues encountered in transferring Pensum to the ICT domain:
o All the source texts have to be transferred manually: there is no automatic “import” procedure to add course texts to the database.
o Internet Explorer is not usable. Warn the user (or better, test) that Firefox or Safari must be used. Moreover, advise learners to use Firefox during the pilot study.
Reason for adoption
Even though the reliability of feedback is not optimal, Pensum allows learners to re-read the course and the synthesis carefully and thereby to learn better.
Barrier to adoption
Not enough tools like arrows, colours or other visual cues.
Tutor workshop(s)
Reasons for adoption
The teachers can focus on higher level tasks while Pensum handles lower ones
Support to carry out competence improving exercises
Barriers to adoption
Learners who rephrase the course are more disadvantaged than those who merely paraphrase it
Teachers already perform the tasks Pensum promotes, so they are reluctant to transfer their work to a machine.
Learners interview
Reasons for adoption
Allows learners to deepen their synthesis when they have missed relevant elements in the source text
Allows learners to limit off-topic content when they tend to write too much
A better analysis of the source text
Barriers to adoption
It takes time to learn how to use Pensum
It takes time to check all the prompts and to revise the synthesis accordingly
The layout is not saved
No highlighting
It doesn't take abbreviations into account
Not enough space to write the synthesis
Teaching manager interview
Reason for adoption
The three types of feedback help learners correct mistakes and thereby improve the synthesis.
Barriers to adoption
Unless comprehensive help is offered to the user, some functionalities (e.g., feedback rejection, feedback tolerance tuning) are not easy to understand in the current version of Pensum (Manager #1).
Unexpected technical issues
The involvement of tutors and teachers (who actually help learners use the tool and answer their questions) might be improved with a fully integrated, dedicated interface that makes it easier to manage learners and courses.
Transferability questionnaire: Relevance of the service in other pedagogic settings
CNED uses a Blackboard/WebCT platform (see http://www.sciencedu.org/), so integrating our service into this platform is difficult.
Transferability questionnaire: Relevance of the service in other pedagogic settings
Pedagogic setting | Reason(s)
Pedagogic settings for which the service would be suitable (see also D5.3 § 5.3 for more information):
Use of Pensum for revising for exams. | Students can get a good overall view of the course texts and can also get feedback on their understanding.
Use of Pensum for students to get ideas from source texts before a debate (using chat) on a given topic. | Students can work on a topic without being influenced by others. Ideas are more structured than with mere reading.
Use of Pensum to work on a case study. | Studying in-depth cases is fostered by their reformulation (synthesis writing).
Use of Pensum to trigger a debate. The participants write out their own opinion on a topic. | As for a case study, a debate runs smoothly when all the participants clearly understand the different questions and issues on a given topic.
Pedagogic settings for which the service would be less suitable:
Setting 1: Problem-based learning (PBL). Students work to solve a given problem in order to learn a domain. | PBL entails a very structured procedure (steps toward a solution) in order to solve the problem. This procedure is not taken into account in our service.
Transferability questionnaire: Relevance of the service in other domains
Types of domain | Reason(s)
Types of domain for which the service would be suitable:
Setting 1: all domains with textual descriptions of descriptive knowledge would suit (no images, no formulas, no procedural knowledge): literature, psychology, education, social and human sciences. | LSA doesn't account for pictorial descriptions. Moreover, procedural knowledge is difficult to address, since the order of the steps of a procedure is crucial and difficult for LSA to analyse (bag-of-words approach).
Setting 2: use Pensum to work on the minutes of a meeting. The minutes are in the course part, and the synthesis part is what each participant actually remembers about the meeting. The feedback highlights the discrepancies between the minutes and what each participant has in mind after the meeting. This would enhance the quality of the decisions made during the meeting. | The task of agreeing with the minutes of a meeting is very close to the comprehension of a source text. A poor comprehension of some items of the minutes may lead to their rephrasing.
Types of domain for which the service would be less suitable:
Setting 1: see above: cartography-based geography, some medicine-oriented domains, etc. Setting 2: all domains for which no large general language corpus exists. | See above.
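The remark about procedural knowledge can be illustrated with a small sketch of the bag-of-words limitation itself. It does not reproduce LSA or Pensum's processing; the function name `bag_of_words` and the two example procedures are hypothetical. Two descriptions that list the same steps in a different order yield identical word counts, so any representation built only on such counts cannot tell them apart.

```python
from collections import Counter


def bag_of_words(text):
    """Count word occurrences, discarding word order entirely."""
    return Counter(text.lower().split())


# Two hypothetical procedures with the same words but a different step order.
steps_a = "open the valve then start the pump"
steps_b = "start the pump then open the valve"

# The texts differ, yet their bag-of-words representations are identical.
print(steps_a != steps_b)
print(bag_of_words(steps_a) == bag_of_words(steps_b))
```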
Section 6: Conclusions
Validation Topics
OVT
Operational Validation Topic
Validated unconditionally
Validated with qualifications*
Not validated
N/A
Qualifications to validation
PVT1: Verification of accuracy of NLP tools
OVT1.1
According to tutors, in a high proportion of cases, the feedback presented by the system correctly identifies the concepts missing in the learners' syntheses that are present in the source texts.
CNED-Lyon/Rouen (UPMF)
Though the results are low on this OVT, they should be moderated by the fact that the task requires a high cognitive load and heavily depends on the synthesis length. Moreover, the task forces the tutors to analyse the text sentence by sentence, whereas they usually work with larger text units.
OVT1.2
According to tutors, in a high proportion of cases, the feedback correctly identifies concepts present in the learners' syntheses which are not present in the source texts.
CNED-Lyon/Rouen (UPMF)
The off-topic detection shows better results than the previous function. A difficulty in the task, which is inherent to self-regulated learning, is that the learner decides on the length constraints of his/her summary. This makes the task of the tutor regarding off-topic more difficult (cf. expert quote).
OVT1.3
According to tutors, in a high proportion of cases, the feedback correctly identifies gaps in the coherence of the learners' syntheses.
CNED-Lyon/Rouen (UPMF)
The coherence feedback seems to over-generate, which is partly due to the sentence extraction algorithm.
PVT2: Tutor efficiency
OVT2.1
The tutor spends less time preparing feedback.
CNED-Lyon/Rouen (UPMF)
OVT2.2
It is easier (there is less cognitive load) for tutors to provide feedback using Pensum compared with just reading learner texts
CNED-Lyon/Rouen (UPMF)
Because the display must be improved ergonomically, tutors indicate some cognitive load using Pensum (the Likert scale reported in question 11a is in reverse order).
PVT3: Quality and consistency of (semi-)automatic feedback OR information returned by the system
OVT3.1
The teacher's activity shifts towards providing more advanced feedback.
CNED-Lyon/Rouen (UPMF)
Insufficient evidence to categorise.
OVT3.2
The feedback given is more consistent than that of different tutors (there is more homogeneity among the responses provided to learners).
CNED-Lyon/Rouen (UPMF)
The overall low precision and recall rates prevented users from assessing feedback consistency. What was assessed here was feedback relevance (the results are consistent with those for OVT 1.1 to 1.3).
OVT3.3
Learners find the feedback given by the system is mostly correct.
CNED-Lyon/Rouen (UPMF)
As indicated above, the overall Precision rates are not quite satisfactory (34%, 26%, and 7%); consequently, learners don't find the feedback correct. These results also explain why learners don't trust the feedback.
OVT3.4
Learners find the feedback given by the system is relevant to (i.e. useful to them in) the task in hand.
CNED-Lyon/Rouen (UPMF)
Even if learners indicate that Pensum helps them learn their lessons, they feel the feedback could be clearer and more precise. Nevertheless, the correlation between the opinion on feedback relevance and the number of feedback requests indicates that the more learners requested feedback, the more relevant to the task in hand they found it. This result allows us to hypothesize that taking advantage of the feedback is tightly linked with the learners' understanding of the system, which requires extensive use.
OVT3.5
Learners trust the feedback provided by Pensum.
CNED-Lyon/Rouen (UPMF)
Even if learners can identify erroneous feedback, they doubt its validity because too many errors are prompted.
OVT3.6
The tutors find the feedback given by the system at the right level considering the task in hand in a high proportion of cases.
CNED-Lyon/Rouen (UPMF)
The Likert scores are not convincing enough for full validation. These results concern the adequacy of the feedback level, not its reliability (see OVT 1.1 through 1.3).
OVT3.7
Tutors perceive that the feedback from Pensum provides a reliable source of information about learners' conceptual coverage.
CNED-Lyon/Rouen (UPMF)
It seems that tutors find a difference between the feedback given by Pensum and what one can infer from it about learners' conceptual coverage. It is worth mentioning that the 'missing concept feedback' scores are perceptibly higher than those of the two other types of feedback on this link, which is understandable as the other two types of feedback are less directly linked with conceptual coverage, even though we believe they contribute to the overall quality of the work. Although the other two types of feedback do not seem to provide sufficient information on conceptual coverage, question Q47 shows respectable acceptance by the tutors.
PVT4: Making the educational process transparent
OVT4.1
Learners can receive feedback whenever they want
CNED-Lyon/Rouen (UPMF)
Learners can get just-in-time feedback in order to revise their syntheses, even though some suggest they should not have to ask for it.
PVT5: Quality of educational output
OVT5.1
The textual output that is handed over to the teacher is of better quality.
CNED-Lyon/Rouen (UPMF)
These results do not allow us to validate that the textual output that is handed over to the teacher is of better quality. On average, the Pensum group score is only marginally superior to the control group score (1st synthesis with a fake interface), and the difference is not significant.
PVT6: Motivation for learning
OVT6.1
The direct feedback provided by the system encourages learners to undertake further study to address gaps in their coverage.
CNED-Lyon/Rouen (UPMF)
Compared to the control group, the feedback doesn't influence the learning motivation.
1. Rather high agreement of experimental group
2. Control group task and fake interface were not designed to motivate the participants.
3. The fact that questionnaires were filled in twice turned out to be tedious.
Nevertheless, correlations between learners' opinions about the way they experience Pensum and the number of feedback requests show that Pensum becomes a tool that facilitates investment in learning in the long run.
PVT7: Organisational efficiency
OVT7.1
There is a saving in institutional resources overall
CNED-Lyon/Rouen (UPMF)
There is insufficient evidence to prove that there is a saving in resources.
PVT8: Relevance
OVT8.1
The service meets one or more institutional objectives
CNED-Lyon/Rouen (UPMF)
PVT9: Likelihood of adoption
OVT9.1
Users were motivated to continue to use the system after the end of the formal validation activities
CNED-Lyon/Rouen (UPMF)
Whereas both learners and tutors would like to use Pensum after the pilot study, tutors show more reluctance due to the absence of a proper learner-teacher communication device available from within the system. Indeed, unless such functionality is implemented, Pensum will be an extra resource to integrate into their everyday practice.
OVT9.2
A high score was obtained in the generic questionnaires (based on UTAUT: likelihood of adoption by users).
CNED-Lyon/Rouen (UPMF)
The UTAUT result is the indicator of the overall evaluation of Pensum. Indeed, Pensum seems to be usable (positive opinion on the Usability and Facilitating conditions dimensions), but its weaknesses, notably in feedback Precision and Recall, influence learners' negative opinion on the Effectiveness, Efficiency and Cognitive Load dimensions. Consequently, the overall evaluation of Pensum is not satisfactory.
OVT9.3
Tutors attending a dissemination workshop give high scores to the question 'how likely are you to consider adopting the service in your own educational practice?'
CNED-Lyon/Rouen (UPMF)
Participants in the dissemination workshop were not only tutors or teachers, but also people from different institutions, such as industrialists, scholars, and people in politics. Thus, it was difficult for many of them to consider adopting Pensum in their practice because they have no actual educational practice. However, the two teachers in the group indicated their willingness to adopt it.
Exploitation (SWOT Analysis)
The objective you are asked to consider is: "Pensum (v. 1.5) will be adopted in pedagogic contexts beyond the end of the project".
Strengths
The strengths of the system (v. 1.5) that would be positive indicators for adoption are:
Pensum provides on-demand support for learners engaged in writing syntheses
Learners who engaged with Pensum were statistically likely to want to use it again
Pensum provides learners with valuable hints concerning gaps in coherence and off-topic elements of their syntheses
Pensum saves tutor time on marking and supporting learners during the writing task (see OVT 2.1 and 7.1)
There is an indication that the quality of the final syntheses is better, where learners engage in requesting
feedback, though further work is required to confirm this
Pensum has no open source equivalent
Weaknesses
The weaknesses of the system (v1.5) that would be negative indicators for adoption are:
The lack of validity of the feedback, despite the self-regulated learning-based functionalities (e.g. for the concept-missing feedback, Pensum takes into account all of the lesson's sentences whereas teachers take into account only the main ideas; consequently the Precision and Recall rates are low)
The change in focus for writing syntheses, from a word processor as the primary tool to Pensum
The lack of text-enhancement functionalities (a synonym dictionary, boldface, highlighting, lists, etc.)
The interface management has to be improved:
a feedback zone distinct from the writing zone
a tutor interface separate from the writing interface and from the source text
Steep learning curve
Opportunities
The system has potential as follows:
Since many educational systems use synthesis or summary writing in their secondary levels, Pensum can be used
at these levels, provided that adapted corpora are processed beforehand.
Pensum can be used as a way to train tutors to be aware of the main features of students' syntheses.
To promote self-regulated learning. This novel approach to learning is the subject of a large number of works and publications, which increases the likelihood of Pensum being used and adopted by external researchers as a way to study or promote students' self-regulated learning.
Threats
Tutor resistance regarding the transfer of their work to a machine
Overly high expectations of the validity and the goals of the feedback may lead to inappropriate uses of Pensum. A detrimental misconception is that Pensum is likely to replace teachers rather than provide hints and guidance to learners in their writing.
If Pensum's compatibility with common e-learning platforms (like Moodle or Dokeos) is low, then companies or universities would be reluctant to adopt it.
Overall conclusion regarding the likelihood of adoption of Pensum Version 1.5:
Pensum v1.5 is not ready at the moment for wider adoption in educational settings, though there are indications that Pensum could, with
improvements, provide useful 'any time, any place' support to learners. Pensum has also demonstrated that it can save tutor time spent in
providing in-exercise feedback and final marking. Self-regulated learning is the object of a large number of research works and publications, and
Pensum is likely to be more sustainable in the research community than in educational settings immediately following the end of LTfLL.
The pilot demonstrated concerns about Pensum replacing tutors, so careful change management would be required for further implementation.
This would suggest that Pensum must be sufficiently ready to attract higher management interest, in order to provide a suitable environment for
change management. Learners also need encouragement to use Pensum: this pilot showed that validation results from learners who engaged with
requesting feedback had statistically higher scores on a number of markers.
This validation study highlights several problems that have to be resolved. Improvements must be made in four main directions. The main problem concerns the precision of the feedback: there are too many errors compared with experts' feedback, so learners cannot trust Pensum's feedback. This is likely to have had a very negative impact on a range of measures in the validation. Improvements to the user interface and to on-line interaction with tutors would also be desirable.
Up to now, Pensum has no open-source rival, though some research-based or commercial rivals of Pensum do exist (see D5.3). The self-regulated learning-based writing-to-learn approach behind Pensum is promising and may lead to its adoption in several e-learning companies or universities following further enhancement.
Most important actions to promote adoption of Pensum:
Technical
Solve the problem of feedback Precision and Recall.
Improve the ergonomic interface and add some indispensable functionalities.
Research
Consider further the relationship between Pensum and a word processor, from the viewpoint of the user being primarily located within the
word processor
Continue dissemination to the research community
Exploitation in educational settings
The precision/recall issue must be improved before further pilots take place
Successful pilots are a prerequisite for further exploitation in educational settings
Careful change management with regard to tutors is an absolute requirement; learners also need to be managed to encourage them to
request feedback regularly
Improve the training of learners.
Section 7 – Road map
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important future
enhancements to the system in order to meet stakeholder requirements:
Most important:
1. Improve feedback validity. Since Recall and Precision rates are sometimes low (respectively 39% and 26% for the off-topic feedback, 12% and 34% for the concept-missing feedback, 85% and 7% for the gap-in-coherence feedback; see OVT 1.1 to 1.3 and OVT 3.5 from D7.4), most learners do not fully trust the feedback produced by Pensum. Consequently, improvement work might first bring precision to a sufficient level and then shift its focus to recall, depending on the user's response to scenario enhancement number 1 (below).
2. Annotation functionalities and ergonomic improvements of the interface. Learners and tutors proposed some important enhancements of the interface: better text and synthesis display (independent, larger), text highlighting and commenting, synthesis formatting and, eventually, taking full advantage of AJAX to compute feedback on the fly and display it to users as they type (see barriers to adoption from the tutor and learner interviews, Section 5 in D7.4).
3. Administrator interface. The managers agreed that Pensum would be worth using in their educational contexts, but for proper use, Pensum tutors or teachers need a fully integrated, specific interface for managing students, courses, LSA spaces and languages.
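The trade-off between Recall and Precision quoted in enhancement 1 can be summarised with a single balanced figure. The sketch below uses the F1 score (harmonic mean), which is an assumption introduced here for illustration: the validation itself reports only the separate Recall and Precision rates.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (recall, precision) pairs as quoted in enhancement 1 of the road map.
reported = {
    "off-topic": (0.39, 0.26),
    "concept missing": (0.12, 0.34),
    "gap in coherence": (0.85, 0.07),
}

# F1 is NOT reported in the source; it is computed here only to
# illustrate how far each feedback type is from a balanced result.
f1_by_feedback = {
    name: round(f1_score(precision, recall), 2)
    for name, (recall, precision) in reported.items()
}
print(f1_by_feedback)
```

Under this measure the gap-in-coherence feedback, despite its high Recall, scores lowest because its Precision is so low, which is consistent with the priority given to precision above.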
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important changes to
the current scenario(s) of use in order to meet stakeholder requirements:
Most important:
1. Switch in focus on the feedback. Pensum can be viewed by the learner not only as a feedback tool (which would have to provide fully valid feedback every time) but also as an annotation tool that guides him/her in the writing process. Pensum can be viewed as a checklist of questions to pose in the process of understanding a course. To fully perform this scenario enhancement, system enhancement number 2 (above) should be completed as well.
2. Allow more appropriate learner scenarios of use and behaviors. The Round 3 pilot study showed that only learners who persevered in the use of Pensum saw their efforts positively affect their learning motivation (e.g., correlations in OVT 3.4 and 6.1 in D7.4). So new scenarios would be worth devising. For instance:
Ask learners to work not with individual sentences but with groups of sentences, by allowing them to group sentences into sense units before relaunching the analysis. This might also be a lead towards better feedback (system enhancement number 1, above).
Since the more the learners used Pensum's feedback, the more positive their opinion of it was, a new scenario of use would require learners to ask for feedback at predetermined moments of their production (e.g., once a paragraph is written).
Enhance the tutors' role within the scenario of use of Pensum and work on their training, as well as that of the teachers. Devise scenarios that allow the learners to use Elgg as a social website for e-learning as a whole, and in which Pensum can be used.
3. Information for managers. The managers' interview showed that Pensum should be considered not as a way to save resources per se but rather as a way to smooth the educational process. Consequently, the roll-out of Pensum needs continuous and careful educational support for teachers and learners.
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are possible additional educational
contexts for future deployment:
Most important:
Since many educational systems use synthesis or summary writing in their secondary levels, Pensum is likely to be used at these levels, provided
that adapted corpora are processed.
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important issues for
future technical research to enable deployment of language technologies in educational contexts:
Most important:
1. Consider alternatives to LSA for assessing free text. Recent research literature challenges the overwhelming dominance of LSA (used, e.g., in Pensum's engine) for assessing educational material. New methods have been proposed that aim at improving the basic similarity measures, such as Probabilistic LSA, Latent Dirichlet Allocation or Random Indexing.
2. System adaptability. Traces of users' behavior with our services can prove to be a reliable source of information for adaptability. The vector spaces chosen and the parameters used could be made to adapt better to variations of domain, source text, type of writing or even learner/tutor interaction with the system. Indeed, we now have the infrastructure to allow the user to change the different threshold values and to explicitly question the feedback. Facilities that allow the user to experiment with the vector space used have also already been implemented.
3. Usability. A problem for Pensum (and Conspect as well) was the import of documents from a variety of sources (Word, PDF, web pages). A specific web-based tool would be very useful to provide this functionality. This tool would function as Firebug (http://getfirebug.com/) does for selecting elements: the learner would select the relevant elements for import (content as opposed to ads or navigation menus) and they would be automatically uploaded and processed by the language technology-based application.
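The user-adjustable thresholds mentioned under system adaptability can be illustrated with a minimal sketch. It assumes, purely for illustration, that each sentence and the source text are already represented as vectors in some semantic space and that a sentence is flagged as off-topic when its cosine similarity to the source falls below a threshold. The function names, vectors and threshold values are hypothetical and are not taken from Pensum's implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def flag_off_topic(sentence_vectors, source_vector, threshold):
    """Return the indices of sentences whose similarity is below the threshold."""
    return [
        i for i, vec in enumerate(sentence_vectors)
        if cosine(vec, source_vector) < threshold
    ]


# Hypothetical 3-dimensional vectors standing in for a real semantic space.
source = [1.0, 1.0, 0.0]
sentences = [[1.0, 0.9, 0.1], [0.0, 0.1, 1.0], [0.5, 0.0, 0.5]]

# The same text produces different feedback depending on the threshold chosen.
strict = flag_off_topic(sentences, source, threshold=0.6)
lenient = flag_off_topic(sentences, source, threshold=0.3)
print(strict, lenient)
```

Raising the threshold flags more sentences (more feedback, likely lower precision); lowering it flags fewer. Exposing this value to the user is one way to let learners and tutors tune the feedback to a domain or type of writing.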
Roadmap - validation activities
Further validation planned for beyond the end of the project:
Claim (OVT): The ergonomics of the interface are satisfactory
Methodology: Learner and tutor questionnaires
Claim (OVT): Pensum's interface allows the source text to be read satisfactorily
Methodology: Learner and tutor questionnaires
Claim (OVT): Pensum's interface allows the synthesis to be written satisfactorily
Methodology: Learner questionnaire
Claim (OVT): Interaction with the learner/tutor is facilitated by Pensum
Methodology: Learner and tutor questionnaires
Claim (OVT): Pensum's feedback is viewed as guidelines for working on a course rather than as prescriptions that must be followed.
Methodology: Learner questionnaire.
Claim (OVT): Pensum's guidance doesn't force the learner/tutor to adopt undesired work habits.
Methodology: Learner and tutor questionnaires.
Claim (OVT): The more learners ask for feedback (and tutors guide learners about it), the more positive their opinion of it is.
Methodology: Learner and tutor questionnaires, trace data.
Appendix B.7 Validation Reporting Template for WP6.1 (IPP-BAS & Sofia University)
Section 1: Functionality implemented in Version 1.5 and alpha / beta-testing record
| Brief description of functionality | Version number of unit | Changes from Version 1.0 |
|---|---|---|
| Annotation service | v1.5 | In addition to the linguistic pipe that processes learning objects in English (so that they become searchable within FLSS), another one was compiled for processing learning objects in Bulgarian. Stakeholders can thus explore the multilingual facility, i.e. retrieve relevant texts in two languages instead of only one. The efficiency of the annotation process itself was also optimized, so that adding newly processed materials to the repository is easier and faster. |
| Lexicalisation service | v1.5 | The new ontological concepts have been tuned to the language-specific lexicons: the newly added words and phrases from English and, especially, from Bulgarian materials. In this way, the semantic search within the repository will be more precise and have better coverage. |
| Statistics element | v1.5 | The service displays the number of concept occurrences per document at the stakeholder's request. By considering the frequency of the concepts present, the stakeholder can get a better impression of whether or not the document is relevant to the topic. |
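The idea behind the Statistics element, counting concept occurrences per document, can be sketched as follows. This is a minimal illustration and not the FLSS implementation: the function name and the list of annotations are hypothetical, and only the `lt4el:` concept identifiers follow the naming used later in this template.

```python
from collections import Counter


def concept_counts(annotations):
    """Count how often each ontology concept was annotated in one document."""
    return Counter(annotations)


# Hypothetical annotation output for a single learning object.
doc_annotations = [
    "lt4el:Table", "lt4el:TableTag", "lt4el:Table",
    "lt4el:Image", "lt4el:Table",
]

counts = concept_counts(doc_annotations)
# Most frequent concepts first, as a stakeholder might want to see them.
print(counts.most_common())
```

A document in which the concept of interest dominates the counts is more likely to be relevant to the topic than one that mentions it once in passing.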
Alpha-testing
Pilot site and language
IPP-BAS (Bulgarian)
Date of completion of alpha testing:
20 Sept 2010
Who performed the alpha testing?
Kiril Simov, Petya Osenova, Laska Laskova
Beta-testing
Pilot site and language: IPP-BAS(Bulgarian)
Has stakeholder validation taken place using the service embedded in Elgg (Yes/No/Partially): Partially
If ‘No’ or ‘Partially’, give reasons: The service has been embedded in Elgg, but the components require full screen
beta-testing performed by: Stanislava Kancheva (tutor), Alexander Savkov (tutor)
beta testing environment (stand-alone service / integrated into Elgg): integrated into Elgg
HANDOVER DATE:
15 Oct 2010
Section 2: Validation Pilot Overview
NB Information about pilot sites, courses and participants has been transferred to Appendix A.3
Pilot task
Pilot site:
IPP-BAS, Sofia University
Pilot language: Bulgarian
What is the pilot task for tutors and how do they interact with the system?
The tutors have to structure a course unit within an introductory course in the IT domain, called “Introduction to HTML”. To select the topics, they use the information provided by the domain ontology, and to select the relevant learning materials, they rely on the semantic search facility.
What do the tutors produce as outputs?
Selected materials and a draft course unit. The former are stored in the FLSS repository, while the latter is created outside the system.
How long does the pilot task last, from the tutors starting the task to their final involvement with the software?
Two weeks
How do tutors/student facilitators interact with the learners and the system?
There is no interaction with the learners: the goal is to develop course units on a selected topic in the IT domain. Tutors interact directly with the system at times of their choosing, at work or at home, or both. There are no limitations on the number of interactions.
Describe any manual intervention of the LTfLL team in the pilot:
The tutor has to write down the structure of the course unit in a file stored outside the system. He/she also has to add the titles of the relevant learning objects found within the system.
Experiments
Experiment 1 (only verification experiment):
Name of experiment: Evaluating the output of the three versions of the language pipe
Objective(s): The FLSS team needs to know how efficient the addition of new learning material to the repository will be. On the other hand, the user needs to know what annotation suffices for her/his needs. Pipe 0.3 was the only version used as the language pipe in version 1 of the semantic annotation. Some users reported during the validation that the annotation process was too slow. Thus, the semantic annotation service has been split into different language pipes, each of which adds a specific annotation.
Details: The verification was performed by comparing the output of the three pipes with each other and to the gold standard. We selected three
learning objects in the sub domain of HTML and annotated each of them using all three pipes.
The results are as follows: Pipe 0.1 annotated 316 concepts; Pipe 0.2 annotated 282 concepts; and Pipe 0.3 annotated 299 concepts. The set of
annotations produced by Pipe 0.3 completely contains the set produced by Pipe 0.2. Pipe 0.3 distributed via coreferential chains 17 concepts to
pronouns, and 11 terms were annotated with more specific concepts. For the 11 more specific concepts we considered the old annotations when
we compared Pipe 0.2 and Pipe 0.3. This comparison shows that usage of Pipe 0.2 reduces the number of the annotated phrases in the text, but
is comparable with Pipe 0.3 concerning the concept coverage. Pipe 0.1 wrongly annotated 34 concepts compared to the output of Pipe 0.2. From
these 34 concept annotations, 6 concepts are unique. The error rate is 10.7 %.
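The figures reported for Experiment 1 can be cross-checked with a few lines of arithmetic. The sketch below makes one assumption that the text does not state: that the error rate is the number of wrong Pipe 0.1 annotations divided by the total number of Pipe 0.1 annotations. Under that assumption the result is about 10.8 %, close to the reported 10.7 %.

```python
# Concept annotations produced by each pipe, as reported above.
pipe_counts = {"0.1": 316, "0.2": 282, "0.3": 299}

# Pipe 0.3 contains everything from Pipe 0.2; the difference should equal
# the 17 concepts distributed to pronouns via coreferential chains.
coreference_annotations = 17
extra_in_pipe_03 = pipe_counts["0.3"] - pipe_counts["0.2"]
print(extra_in_pipe_03 == coreference_annotations)

# ASSUMPTION: the denominator of the error rate is the Pipe 0.1 total.
wrong_in_pipe_01 = 34
error_rate = 100 * wrong_in_pipe_01 / pipe_counts["0.1"]
print(f"{error_rate:.2f} %")
```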
Experiment 2 (also validated in OVT 1.1):
Name of experiment: Semantic search verification
Objective(s): Aims at providing evidence that the service returns relevant learning objects.
Details: The verification was organized as a workshop with five tutors from IPP-BAS. The work was done in the period 27.09.2010 to 1.10.2010
with version 1.5 of the FLSS services. The tutors have been divided into two groups. The first one (two tutors) was shown a list of learning
materials related to HTML which are available within the FLSS. They were asked to choose three topics and to augment them with all the relevant
learning objects in the system. The result from this activity was a gold standard with respect to the specified topics and their underlying learning
objects. The other group (three tutors) was asked to formulate queries with respect to one of the topics and to perform a semantic search for
relevant material. The retrieved materials were then automatically compared to the gold standard for each topic.
Two metrics have been calculated: Precision and Recall. For the semantic search verification the Precision is considered more important, since
the retrieved material must be relevant to a topic. However, also Recall has been considered, because it leads to the conclusion that specific
searches should be related to specific topics. Otherwise, recall drops.
NB The results of this experiment are presented in more detail in D6.3.
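The comparison of retrieved materials against the gold standard described above can be sketched as a set computation. The function and the learning-object identifiers below are hypothetical; the definitions of Precision and Recall are the standard ones.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) of a retrieved set against a gold standard."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical gold standard for one topic and one query result.
gold_standard = {"lo1", "lo2", "lo3", "lo4"}
query_result = ["lo1", "lo2", "lo9"]

p, r = precision_recall(query_result, gold_standard)
print(f"precision={p:.2f} recall={r:.2f}")
```

A very specific query tends to return few objects, most of them relevant (high Precision), while missing much of a broadly defined topic (low Recall), which is the effect noted above.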
Section 3: Results - validation/verification of Validation Topics
OVT: 1.1
Pilot site: IPP-BAS
Pilot language: Bulgarian
Operational Validation Topic: A high proportion of the learning objects (LOs) offered by the system is relevant to the topic chosen by the teacher.
Summative results with respect to validation indicator:
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: LTfLL staff from IPP-BAS (n=5); 3 topics were chosen; 2 people manually selected the relevant LOs from the database; 3 people performed an automatic semantic search and reported the results. The two sets of data were compared with respect to Recall and Precision.
Results: The following table presents the Precision and Recall. It turned out that 8 of the 12 queries resulted in a precision higher than 50%. This verification result shows that people learn very effectively how to specify queries which match their expectations with a high level of precision. This generally means that the appropriate materials are retrieved fast. As expected, the Recall drops when the search becomes too specific with respect to a broadly defined topic. In order to explore the whole topic, the user needs to ask several queries. This observation leads us to plan a new extension of the query mechanism with a set of queries.
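The count of queries above the 50% precision mark can be checked directly against the per-query figures tabulated for this OVT. The snippet below is only a tally of those published values; the variable names are illustrative.

```python
# (query concepts, precision %, recall %) as given in the OVT 1.1 table.
results = [
    ("lt4el:Table", 16.5, 93.8),
    ("lt4el:TableTag", 81.8, 56.3),
    ("lt4el:TableCell", 82.4, 87.5),
    ("lt4el:CaptionTag", 100.0, 6.3),
    ("lt4el:TRTag", 72.7, 50.0),
    ("lt4el:HTMLFontRelatedTag", 30.3, 73.6),
    ("lt4el:BoldTag & lt4el:HTML", 67.4, 53.7),
    ("lt4el:ItalicTag & lt4el:HTML", 63.5, 55.3),
    ("lt4el:BasefontTag", 100.0, 7.7),
    ("lt4el:Image", 13.7, 86.4),
    ("lt4el:Image & lt4el:HTML", 38.1, 72.4),
    ("lt4el:Image & lt4el:HTMLTag", 67.3, 40.4),
]

high_precision = [query for query, precision, _ in results if precision > 50]
print(f"{len(high_precision)} of {len(results)} queries exceed 50% precision")

# Queries with perfect precision illustrate the drop in recall for
# very specific searches.
perfect = [(query, recall) for query, precision, recall in results if precision == 100.0]
print(perfect)
```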
| Concept | Precision | Recall |
|---|---|---|
| lt4el:Table | 16.5 % | 93.8 % |
| lt4el:TableTag | 81.8 % | 56.3 % |
| lt4el:TableCell | 82.4 % | 87.5 % |
| lt4el:CaptionTag | 100 % | 6.3 % |
| lt4el:TRTag | 72.7 % | 50 % |
| lt4el:HTMLFontRelatedTag | 30.3 % | 73.6 % |
| lt4el:BoldTag & lt4el:HTML | 67.4 % | 53.7 % |
| lt4el:ItalicTag & lt4el:HTML | 63.5 % | 55.3 % |
| lt4el:BasefontTag | 100 % | 7.7 % |
| lt4el:Image | 13.7 % | 86.4 % |
| lt4el:Image & lt4el:HTML | 38.1 % | 72.4 % |
| lt4el:Image & lt4el:HTMLTag | 67.3 % | 40.4 % |
OVT: 2.1
Pilot site: IPP-BAS
Pilot language: Bulgarian
Operational Validation Topic: The teacher saves time when developing a course unit in FLSS
compared to traditional means.
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n= |
|---|---|---|---|---|---|---|
| Tutors | 7. It takes less time to complete my teaching tasks using FLSS than without the system. | Experimental | 4.7 | 0.58 | 100% | 3 |
| Tutors | 8. Using FLSS enables me to work more quickly than without the system. | Experimental | 4.7 | 0.58 | 100% | 3 |
| Tutors | 9. I do not wait too long before receiving the requested information. | Experimental | 4.0 | 1.00 | 67% | 3 |
| Tutors | 34. It takes me less time to develop a course unit using FLSS, than without the system. | Experimental | 4.3 | 1.15 | 67% | 3 |
| Tutors | 35. I find using FLSS to develop a course unit is a very time-efficient way of developing a course. | Experimental | 4.7 | 0.58 | 100% | 3 |
| Tutors | 36. I find the process of developing a course unit is quicker using FLSS, compared with not using the system. | Experimental | 4.7 | 0.58 | 100% | 3 |
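To make the three summary columns concrete, the sketch below computes them from raw 5-point Likert responses. The individual responses are not given in this template: the triples used here are illustrative values that happen to reproduce reported rows. The sketch also assumes the sample standard deviation (n-1 denominator) and that a score of 4 or 5 counts as "Agree / Strongly agree".

```python
from statistics import mean, stdev


def summarise(responses):
    """Return (mean, sample standard deviation, % agree) for Likert scores."""
    # ASSUMPTION: 4 = Agree and 5 = Strongly agree on the 5-point scale.
    agree = sum(1 for score in responses if score >= 4)
    return (
        round(mean(responses), 1),
        round(stdev(responses), 2),
        round(100 * agree / len(responses)),
    )


# Illustrative responses only; each triple reproduces one reported pattern.
print(summarise([5, 5, 4]))  # matches rows reported as 4.7 / 0.58 / 100%
print(summarise([3, 4, 5]))  # matches rows reported as 4.0 / 1.00 / 67%
print(summarise([3, 5, 5]))  # matches rows reported as 4.3 / 1.15 / 67%
```

With only three or four respondents per site, a single dissenting answer moves the percentage by 25 to 33 points, so these figures are best read alongside the focus-group comments.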
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
Focus Group 1: “The system helps me to prepare a course faster than usual, because I do not waste time on searching for
materials on the net”
Focus Group 1: “Given the uploaded materials are approved by a competent tutoring authority, I would rather use the system
than search, examine and select materials myself.”
Focus Group 1: “I save time, because the repository provides already selected materials per topic.”
OVT: 2.1
Pilot site: Sofia University
Pilot language: Bulgarian
Operational Validation Topic: The teacher saves time when developing a course unit in FLSS compared to traditional means.
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n= |
|---|---|---|---|---|---|---|
| Tutors | 7. It takes less time to complete my teaching tasks using FLSS than without the system. | Experimental | 4.5 | 1.00 | 75% | 4 |
| Tutors | 8. Using FLSS enables me to work more quickly than without the system. | Experimental | 4.3 | 0.96 | 75% | 4 |
| Tutors | 9. I do not wait too long before receiving the requested information. | Experimental | 3.8 | 1.89 | 75% | 4 |
| Tutors | 34. It takes me less time to develop a course unit using FLSS, than without the system. | Experimental | 4.3 | 0.50 | 100% | 4 |
| Tutors | 35. I find using FLSS to develop a course unit is a very time-efficient way of developing a course. | Experimental | 4.5 | 0.58 | 100% | 4 |
| Tutors | 36. I find the process of developing a course unit is quicker using FLSS, compared with not using the system. | Experimental | 4.8 | 0.50 | 100% | 4 |
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
Focus Group 1: “Simplify the user interface and that will speed up the process of getting used to the system even more”
OVT: 2.2
Pilot site: IPP-BAS
Pilot language: Bulgarian
Operational Validation Topic: The teacher invests less effort (cognitive load) when developing a course unit in FLSS compared to traditional means.
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n= |
|---|---|---|---|---|---|---|
| Tutors | 11a. Please rank on a 5-point scale the mental effort (1 = very low mental effort; 5 = very high mental effort) you invested to accomplish teaching tasks using FLSS. | Experimental | 2.3 | 0.58 | 33% | 3 |
| Tutors | 11b. Overall, using the system requires significantly less mental effort to complete my teaching tasks than when using an Internet browser. | Experimental | 4.7 | 0.58 | 100% | 3 |
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
Focus Group 1: “Working with FLSS requires less effort than using a browser to search for materials - the information is
structured, the materials are retrieved semantically.”
OVT: 2.2
Pilot site: Sofia University
Pilot language: Bulgarian
Operational Validation Topic: The teacher invests less effort (cognitive load) when developing a course unit in FLSS compared to traditional means.
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n= |
|---|---|---|---|---|---|---|
| Tutors | 11a. Please rank on a 5-point scale the mental effort (1 = very low mental effort; 5 = very high mental effort) you invested to accomplish teaching tasks using FLSS. | Experimental | 3.0 | 0.82 | 75% | 4 |
| Tutors | 11b. Overall, using the system requires significantly less mental effort to complete my teaching tasks than when using an Internet browser. | Experimental | 2.8 | 0.96 | 25% | 4 |
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
Focus Group 1: “It takes time to understand how the system works, to understand the difference between the word-based
approach and the FLSS concept-based approach, but after that it's really easy”.
OVT: 3.1
Pilot site: IPP-BAS
Pilot language: Bulgarian
Operational Validation Topic: Teachers perceive that the learning materials offered by FLSS are useful to them in developing a course unit.
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n= |
|---|---|---|---|---|---|---|
| Tutors | 6. The information the system provides me is accurate enough for helping me perform my teaching tasks. | Experimental | 4.0 | 1.00 | 67% | 3 |
| Tutors | 37. FLSS provides learning materials that are relevant to my topic | Experimental | 4.0 | 1.00 | 67% | 3 |
| Tutors | 38. The learning materials retrieved are useful for the course design. | Experimental | 4.0 | 0.00 | 100% | 3 |
| Tutors | 39. The majority of the retrieved learning objects fit my course topic. | Experimental | 3.7 | 0.58 | 67% | 3 |
| Tutors | 40. I trust the system to offer me learning materials useful for the course I am designing. | Experimental | 4.3 | 0.58 | 100% | 3 |
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
Focus Group 1: “Probably for the regular teacher in a non-academic environment it would be a bit tricky to add new learning
objects so it would be better for FLSS staff to provide more.”
Tutors
Focus Group 1: “I think the materials available in the system were suitable for an introductory course, but not as a source for
the advanced students.”
OVT: 3.1
Pilot site: Sofia University
Pilot language: Bulgarian
Operational Validation Topic: Teachers perceive that the learning materials offered by FLSS are useful to them in developing a course unit.
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n= |
|---|---|---|---|---|---|---|
| Tutors | 6. The information the system provides me is accurate enough for helping me perform my teaching tasks. | Experimental | 4.5 | 1.00 | 75% | 4 |
| Tutors | 37. FLSS provides learning materials that are relevant to my topic | Experimental | 4.3 | 1.50 | 75% | 4 |
| Tutors | 38. The learning materials retrieved are useful for the course design. | Experimental | 4.3 | 0.96 | 75% | 4 |
| Tutors | 39. The majority of the retrieved learning objects fit my course topic. | Experimental | 4.0 | 0.82 | 75% | 4 |
| Tutors | 40. I trust the system to offer me learning materials useful for the course I am designing. | Experimental | 4.5 | 1.00 | 75% | 4 |
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
Focus Group 1: “The materials fit the curriculum and it's easy to decide which documents to recommend.”
Tutors
Focus Group 1: “Probably we are going to use FLSS for redesign of some parts of the courses ‘Administering SQL Server’ and/or ‘Querying MS SQL Server’.”
OVT: 4.1
Pilot site: IPP-BAS
Pilot language: Bulgarian
Operational Validation Topic: Using the ontology assists the teacher in establishing the hierarchy of main concepts within the course unit.
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n= |
|---|---|---|---|---|---|---|
| Tutors | 41. Browsing the ontology helps me decide which topics to include in my course. | Experimental | 4.0 | 1.00 | 67% | 3 |
| Tutors | 42. Browsing the ontology helps me to structure my course in a comprehensive way (so that all important aspects are covered). | Experimental | 4.3 | 0.58 | 100% | 3 |
| Tutors | 43. Browsing the ontology helps me see the relationships between different concepts in my course. | Experimental | 4.3 | 1.15 | 67% | 3 |
| Tutors | 44. Browsing the ontology helps me to introduce concepts in a logical order in my course. | Experimental | 4.3 | 1.15 | 67% | 3 |
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
Focus Group 1: “Usually when I prepare a course, I already have an idea about the course structure grounded in my own
understanding about the educational process, but it is always useful to get another point of view.”
Tutors
Focus Group 1: “I would prefer the thematic classification to the ontological one. For example, if I search for concepts related to ‘font’, I would like to see also size, font type etc. in the same place.”
OVT: 4.1
Pilot site: Sofia University
Pilot language: Bulgarian
Operational Validation Topic: Using the ontology assists the teacher in establishing the hierarchy of main concepts within the course unit.
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n= |
|---|---|---|---|---|---|---|
| Tutors | 41. Browsing the ontology helps me decide which topics to include in my course. | Experimental | 4.5 | 0.96 | 75% | 4 |
| Tutors | 42. Browsing the ontology helps me to structure my course in a comprehensive way (so that all important aspects are covered). | Experimental | 4.0 | 0.82 | 75% | 4 |
| Tutors | 43. Browsing the ontology helps me see the relationships between different concepts in my course. | Experimental | 4.5 | 1.00 | 75% | 4 |
| Tutors | 44. Browsing the ontology helps me to introduce concepts in a logical order in my course. | Experimental | 4.3 | 1.00 | 75% | 4 |
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
Focus Group 1: “The ontology helps a lot in constructing a hierarchy of domain specific concepts, which forms the basis of the
course.”
Tutors
Focus Group 1: “The present ontology helps me if I prepare a basic course. However, I think that for master and PhD courses I
would need a more complex resource as a support service, which includes more relations besides ‘is-a’.”
OVT: 5.1
Pilot site: IPP-BAS
Pilot language: Bulgarian
Operational Validation Topic: The teacher thinks that the quality of the derived main structure of a course, together with its relevant support material, is good.
| Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n= |
|---|---|---|---|---|---|---|
| Tutors | 5. The FLSS helps me to improve the quality of my support to learners. | Experimental | 4.3 | 0.58 | 100% | 3 |
| Tutors | 45. I believe that the main structure of the course I have designed is of good quality. | Experimental | 4.0 | 1.00 | 67% | 3 |
| Tutors | 46. I believe that the content of the course I have designed is of good quality. | Experimental | 4.0 | 1.00 | 67% | 3 |
| Tutors | 47. Overall, I am satisfied with the course I have designed. | Experimental | 4.0 | 0.00 | 100% | 3 |
| Tutors | 48. I believe that FLSS has helped me design a better course than when using traditional means. | Experimental | 4.3 | 0.58 | 100% | 3 |
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
Focus Group 1: “Once I get an idea about how the information on the course topic is structured, it is much easier to explore
different curriculum models consistent with the level of knowledge of the learners group.”
Tutors
Focus Group 1: “I‟m quite satisfied with my course structure.”
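The Mean, Standard deviation and % Agree / Strongly agree columns in the questionnaire tables can be sketched as follows. This is a minimal illustration, not the procedure used in the report: it assumes a 5-point Likert scale on which scores of 4 and 5 count as Agree / Strongly agree, and it assumes the sample (n - 1) standard deviation, which appears consistent with the reported figures but is not stated in the text. The function name and the response lists are hypothetical; the lists were chosen only because they reproduce rows reported for n = 3.

```python
from statistics import mean, stdev


def summarise_likert(responses, agree_threshold=4):
    """Return (mean, sample SD, % agree or strongly agree, n) for Likert scores."""
    n = len(responses)
    avg = float(mean(responses))
    # Sample standard deviation (n - 1 denominator); an assumption, see lead-in.
    sd = float(stdev(responses)) if n > 1 else 0.0
    pct_agree = 100 * sum(1 for r in responses if r >= agree_threshold) / n
    return round(avg, 1), round(sd, 2), round(pct_agree), n


# Hypothetical response sets (not taken from the report's raw data).
print(summarise_likert([4, 4, 5]))  # (4.3, 0.58, 100, 3)
print(summarise_likert([3, 4, 5]))  # (4.0, 1.0, 67, 3)
print(summarise_likert([4, 4, 4]))  # (4.0, 0.0, 100, 3)
```

With only three or four respondents per site, a single response changes the percentage by 25 to 33 points, which is worth keeping in mind when comparing rows.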
OVT: 5.1 | Pilot site: Sofia University | Pilot language: Bulgarian
Operational Validation Topic: The teacher thinks that the quality of the derived main structure of a course, together with its relevant support material, is good.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Tutors | 5. The FLSS helps me to improve the quality of my support to learners. | Experimental | 4.8 | 0.50 | 100% | 4
Tutors | 45. I believe that the main structure of the course I have designed is of good quality. | Experimental | 4.0 | 0.82 | 75% | 4
Tutors | 46. I believe that the content of the course I have designed is of good quality. | Experimental | 4.5 | 1.00 | 75% | 4
Tutors | 47. Overall, I am satisfied with the course I have designed. | Experimental | 4.3 | 0.50 | 100% | 4
Tutors | 48. I believe that FLSS has helped me design a better course than when using traditional means. | Experimental | 4.5 | 0.58 | 100% | 4
Formative results with respect to validation indicator
Stakeholder type | Results
Tutors | Focus Group 1: “The good point is that students can easily find suitable materials on any of the topics in my course.”
Tutors | Focus Group 1: “I wish the depth of the suggested structure to be more balanced. I discovered that sometimes the system helps me with more elaborated hierarchical structure, but sometimes it seems to be flat. Then I have to use other ways in order to make it sufficiently detailed.”
OVT: 5.2 | Pilot site: IPP-BAS | Pilot language: Bulgarian
Operational Validation Topic: An advantage of FLSS is that the search can return learning materials in other languages, providing teachers with a wider range of materials for multi-lingual learners
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Tutors | 49. It is useful to be able to include learning materials in more than one language (e.g. English, Bulgarian) in my course unit. | Experimental | 5.0 | 0.00 | 100% | 3
Tutors | 50. FLSS helps me find useful learning materials in more than one language | Experimental | 4.0 | 0.00 | 100% | 3
Tutors | 51. FLSS provides me with a better choice of learning materials because it offers me materials in more than one language. | Experimental | 4.0 | 1.00 | 67% | 3
Formative results with respect to validation indicator
Stakeholder type | Results
Tutors | Focus Group 1: “Since most of my students read in English, and the lexicon in this particular domain is strongly influenced by English terminology, it's very useful to have documents on one topic in the two languages - Bulgarian and English.”
OVT: 5.2 | Pilot site: Sofia University | Pilot language: Bulgarian
Operational Validation Topic: An advantage of FLSS is that the search can return learning materials in other languages, providing teachers with a wider range of materials for multi-lingual learners
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Tutors | 49. It is useful to be able to include learning materials in more than one language (e.g. English, Bulgarian) in my course unit. | Experimental | 4.8 | 0.50 | 100% | 4
Tutors | 50. FLSS helps me find useful learning materials in more than one language | Experimental | 4.8 | 0.50 | 100% | 4
Tutors | 51. FLSS provides me with a better choice of learning materials because it offers me materials in more than one language. | Experimental | 3.8 | 0.96 | 50% | 4
Formative results with respect to validation indicator
Stakeholder type | Results
Tutors | Focus Group 1: “As expected, there are a lot more materials in Bulgarian than in English within the available LOs. But the quality of the latter is better.”
OVT: 7.1 | Pilot site: IPP-BAS | Pilot language: Bulgarian
Operational Validation Topic: There is a saving in institutional resources overall
Formative results with respect to validation indicator:
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Interview with tutors from IPP-BAS (n=4).
Results: The tutors pointed out that several conditions have to be fulfilled for the system to save institutional resources:
- to build a repository with a sufficient number of learning objects that, once established, can be easily updated with new documents while keeping the quality of selections per topic;
- to be able to modify the ontologies; for example, to add new concepts or new lexicalisations.
When these conditions are met, FLSS could prove to be very efficient in reducing most of the time-consuming activities related to teaching preparation.
OVT: 8.1 | Pilot site: IPP-BAS | Pilot language: Bulgarian
Operational Validation Topic: The service meets one or more institutional objectives
Formative results with respect to validation indicator:
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Interview with teaching manager from IPP-BAS (n=1).
Results: Our teachers and our students are, respectively, specialized or specializing in the IT area. Thus, they are very demanding when considering the adoption of new software. It takes time to trust it, and usually our policy is to adopt a system that is already popular world-wide (such as Moodle or ILIAS). On the other hand, we are open to experiments with new systems. If it remains open for public usage and if our tutors (and students) start to use it, we can consider institutional adoption in more serious terms.
To share my own opinion on why I think FLSS is likely to be used: In terms of efficiency, one of the main advantages of the system is the reusability of the learning objects. The benefits are twofold – tutors can select from a number of materials and students have immediate access to them. The number of regular students who go to work is increasing and the system may provide a way to cope with this problem.
Another “burning issue” right now is the ratio between teaching and research time of the tutors, so any tool that can decrease the former in favour of the latter, while keeping and even improving the quality of the educational process, is welcome.
The service can also be used for internal training, for group role detection, and as a tool to inspect the outcome of students' learning-by-authoring activity.
The usage of FLSS depends on different integration issues, licensing and many other requirements that might arise.
OVT: 8.1 | Pilot site: Sofia University | Pilot language: Bulgarian
Operational Validation Topic: The service meets one or more institutional objectives
Formative results with respect to validation indicator:
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Interview with teaching manager from Sofia University (n=1).
Results: I am the Vice-Dean responsible for the BA programs at the Faculty of Slavic Languages. In our Faculty, there are already some optional classes at BA level in IT in order to prepare the students for the MA program in Computational Linguistics and for working with more advanced tools over the language. We are at the beginning of experimenting with eLearning courses in various areas. Thus, we trust Moodle, for example.
FLSS is very suitable for our purposes for the following reasons: 1. the tutors who have been involved in the validation share the opinion that, after the initial effort of getting acquainted with the system, they started to use it easily and became eager to master it; 2. FLSS is free and web-based, and the support team is very close to us and we can rely on them; and 3. the services suit our requirements for basic (not advanced) courses in IT. In our educational system, the tutor has the freedom to choose her/his approaches to make a course or to work with the students. I can only support the popularization of FLSS within our Faculty. I would be more confident, however, if the Faculty of Mathematics and Informatics shared its opinion on this system.
OVT: 9.1 | Pilot site: IPP-BAS | Pilot language: Bulgarian
Operational Validation Topic: Users were motivated to continue to use the system after the end of the formal validation activities
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Tutors | 21. I would recommend this system to other teachers to help them in their teaching. | Experimental | 4.7 | 0.58 | 100% | 3
Tutors | 22. I am eager to explore different things with FLSS. | Experimental | 4.7 | 0.58 | 100% | 3
Tutors | 29. I would like to use the service in my teaching after the pilot. | Experimental | 4.7 | 0.58 | 100% | 3
Tutors | 30. If the service is available after the pilot, I will definitely use it in my teaching. | Experimental | 4.7 | 0.58 | 100% | 3
Formative results with respect to validation indicator
The system remained open after the validation. The stakeholders were assured that the FLSS team could help them with adding materials to the repository, and with processing and handling a new ontology with a lexicon (if needed). The participants expressed interest in using it. However, at this time of the academic year nobody had a course to prepare in IT. For that reason, we suggested that this OVT be additionally validated through the questionnaire.
Stakeholder type | Results
Tutors | Focus Group 1: “The work with the system stimulates me to experiment and try different ways to organize my course, so I intend to use it in the future.”
Tutors in dissemination workshop | Focus Group 1: “I would consider adopting the software because it provides a useful framework to optimize the learning processes.”
OVT: 9.1 | Pilot site: Sofia University | Pilot language: Bulgarian
Operational Validation Topic: Users were motivated to continue to use the system after the end of the formal validation activities
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly agree | n
Tutors | 21. I would recommend this system to other teachers to help them in their teaching. | Experimental | 5.0 | 0.00 | 100% | 4
Tutors | 22. I am eager to explore different things with FLSS. | Experimental | 4.8 | 0.50 | 100% | 4
Tutors | 29. I would like to use the service in my teaching after the pilot. | Experimental | 5.00 | 0.00 | 100% | 4
Tutors | 30. If the service is available after the pilot, I will definitely use it in my teaching. | Experimental | 5.00 | 0.00 | 100% | 4
Formative results with respect to validation indicator
The system remained open after the validation. The stakeholders were assured that the FLSS team could help them with adding materials to the repository, and with processing and handling a new ontology with a lexicon (if needed). The participants expressed interest in using it initially as it is, since the courses at a humanities faculty are basic. One of the tutors kept using it, since this semester he has a course in IT at MA level. In order to get other stakeholders' opinions, we suggested that this OVT be validated through the questionnaire.
Stakeholder type | Results
Tutors | Focus Group 1: “While I was testing the system, I kept thinking 'What if it was not an IT domain ontology, but Linguistic ontology - that would be really nice!' - so I would very much like to use the FLSS again to my benefit.”
Tutors | Focus Group 1: “It was fun exploring the system, so I would recommend it to my colleagues.”
OVT: 9.2 | Pilot site: IPP-BAS, Sofia University | Pilot language: Bulgarian
Operational Validation Topic: A high score was obtained in the generic questionnaires (based on UTAUT: likelihood of adoption).
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Generic questionnaire (tutors). Because of low numbers, results were aggregated for the two pilot sites.
Results:
Descriptive Statistics - Tutors
Scale | N | Mean | Std. Deviation
Effectiveness | 7 | 4.55 | 0.356
Efficiency | 7 | 4.21 | 0.918
Cognitive Load | 7 | 3.14 | 0.378
Usability | 7 | 4.20 | 0.566
Satisfaction | 7 | 4.71 | 0.300
Facilitating conditions | 7 | 4.29 | 0.591
Self-Efficacy | 7 | 4.10 | 0.738
Behavioural intention | 7 | 4.86 | 0.378
Transferability | 7 | 3.86 | 0.476
IPP-BAS & SU | 7 | 4.38 | 0.414
Valid N (listwise) | 7
OVT: 9.3 | Pilot site: IPP-BAS | Pilot language: Bulgarian
Operational Validation Topic: Tutors attending a dissemination workshop give high scores to the question 'how likely are you to consider adopting the service in your own educational practice?'
Formative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Tutors from IPP-BAS (n=8)
Results: 6 tutors stated that they would like to adopt the service for their own purposes, while 2 were not sure whether the service is useful to them at this stage of its development. The positive tutors are involved in European or national projects, in which they see prospects for relying on FLSS in their work.
Tutors | “We are developing a national project for creating new application-oriented methods and end-user oriented tools for Semantic Web Service descriptions oriented to Technology Enhanced Learning (http://sinus.iinf.bas.bg/index.php). The project uses a 'learning by authoring' approach in the Bulgarian Iconography domain. The learners are creating a multimedia document that contains primary multimedia resources and some texts. We might use some part of FLSS in order to support the instructor when she reviews the progress of the learners, as a means of supporting them in their work.”
OVT: 9.3 | Pilot site: Sofia University | Pilot language: Bulgarian
Operational Validation Topic: Tutors attending a dissemination workshop give high scores to the question 'how likely are you to consider adopting the service in your own educational practice?'
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Tutors from Sofia University (n=4)
Results: All the tutors stated that they would like to use the service for their own tasks.
Tutors | “I can prepare my course on mark-up languages for the MA in Computational Linguistics.”
OVT: 9.4 | Pilot site: IPP-BAS | Pilot language: Bulgarian
Operational Validation Topic: Teachers and managers are motivated to adopt the system, because it suggests multilingual search.
Formative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Tutors from IPP-BAS (focus group, n=3)
Results: All tutors included materials both in Bulgarian and English. They stated that one of the most obvious advantages of FLSS is its multilinguality. Even though there is a shared observation that students are more willing to read materials in one language, this feature still allows more options for students with different needs and preferences.
OVT: 9.4 | Pilot site: Sofia University | Pilot language: Bulgarian
Operational Validation Topic: Teachers and managers are motivated to adopt the system, because it suggests multilingual search.
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Tutors from Sofia University (focus group, n=4)
Results: Again, materials in both Bulgarian and English were selected as a result of testing the system. The main argument was that in the IT domain English tends to be the “working language” and even if there are sufficient LOs in Bulgarian, it is almost necessary to provide materials in English – there are a lot of terms in the Bulgarian IT domain lexicon that are not unanimously accepted.
Section 4: Results – validation activities informing future changes / enhancements to the system
Pilot partner: IPP-BAS
Service language: English
Additional formative results (not associated with validation topics), by validation activity:
Alpha testing
- Annotation visualisation is confusing when there is more than one annotation / comment.
- Time for ontology load is too long when two lexicons for two languages are attached to it.
Beta testing
- Multiple upload of documents does not work.
- Management of the retrieved documents is problematic (if you focus on one piece, the others disappear).
Tutor interviews
- Text extraction from different types of documents is not done within the system.
- The upper part of the ontology is not understandable to the users. It is too abstract with respect to the specific topic of interest.
- It is difficult initially to remember the sequence of all steps in FLSS when handling learning materials.
- The definitions of the concepts are not in Bulgarian, and they disappear quickly, which hampers the work process.
- Within the set of retrieved documents it is difficult to mark and manage the current selection.
- At first glance it seems that one and the same concept appears in several places in the ontology. Stakeholders do not understand why. They prefer the discriminative kind of information representation.
Tutor workshop(s)
- When a concept is missing, it is difficult to handle this with the ontology enrichment service. There is no clear procedure for how to do that.
- There is no aggregated statistical information provided for the user-defined groups of LOs.
- It is confusing to be shown the possibility of opening either the repository or the ontology. These steps should be ordered. For example, first the ontology, second the repository, or vice versa.
- It is not clear how to find the most appropriate concept for a topic when browsing the ontology.
- There is no clear mechanism for adapting the domain ontology to newly arriving information in the area.
- I still need time to decide how I might use FLSS as a supplemental tool to LMS and my traditional methods.
- I need to invest some effort in adapting the semantic search module in FLSS for the purpose of our national project on Semantic technologies for Web Services and technologically supported learning (Д-002-189/16.12.08).
Teaching manager interview
- “I am not satisfied with the fact that the control over the course creation process is not entirely controlled by FLSS. Thus, the ready structure and related LOs cannot be validated within the system.”
- “I am not satisfied with the fact that the interface is only in English. The native language of the learners should be equally supported by the system.”
Pilot partner: Sofia University
Service language: English
Additional formative results (not associated with validation topics), by validation activity:
Alpha testing
- N/A (alpha testing had been done only at IPP-BAS)
Beta testing
- N/A (beta testing had been done only at IPP-BAS, since the participants from SU wanted to test the ready version of the software)
Tutor interviews
- Documentation is lacking on some basic but important steps in the course creation process. This slows down the work.
- The tutor is confused about what to do next when all the necessary support windows with information are opened.
- The tutor is initially confused by the procedure of saving documents at different stages of his/her work. It is not always clear what exactly you save and where.
- An aspect of usage is pre-testing the learners, but this feature is very sensitive to different licensing, integration and organizational issues.
Tutor workshop(s)
- The main problem remains how to add new concepts and new materials into FLSS.
- The interface in some parts has a smaller resolution, which leads to a waste of time when trying to adjust it accordingly.
- The large number of error messages that pop up at different stages of work becomes quite annoying.
- The fact that you have to write down the course outside FLSS is annoying.
Teaching manager interview
- “There are only initial steps in the system towards ensuring interactivity among tutors (for example, exchange of comments on their opinions and deletions, combinations, insertions of learning material).”
- “I am not quite sure how FLSS might fit the Moodle architecture.”
Section 5: Results – validation activities informing transferability, exploitation and barriers to adoption
Partner(s) involved: IPP-BAS
Service language: Bulgarian
Additional formative results (not associated with validation topics), by validation activity:
Alpha testing
Major issues encountered in transferring FLSS to Bulgarian:
- Concept annotation accuracy for a morphologically rich language.
- The high frequency of English terms in Bulgarian texts called for the application of NLP components for both languages.
Major issues encountered in transferring FLSS to IT domain:
- Sparseness of up-to-date advanced Bulgarian LOs in the IT domain.
Beta testing
- Bulgarian IT domain lexicon variability and the quality of the documents in this language influence the performance of the system.
- Localisation of the interface is desirable.
Tutor interviews
- The users would like to see the Bulgarian language in more active use. For example, the QuickFind facility operates only on URI.
- There are no sub-domains in IT elaborated enough for teaching purposes.
- The tutors would not like to do the pre-selection and processing of the learning data themselves. They feel uncertain about when, how and by whom these adjustments would be made.
- I would be surprised if the system could be used in the Mathematics domain, since FLSS relies on coherent text data, not on formulas, tables, numbers.
- It could be hard for regular teachers to add new learning objects so it is better to develop them.
- Further user evaluations are needed to advertise successfully a system like FLSS.
- I need to work longer with FLSS before I can give any ideas for its further development.
Tutor workshop(s)
- The FLSS approach should be offered together with a relevant pedagogical strategy on what to do next.
- The corpus of learning objects is too small, especially in Bulgarian.
- Data-driven approaches to teaching are limited, because the ontology imposes some constraints on the domain modelling.
Teaching manager interview
- “Adopting the system institutionally means more workload for our administration.”
- “There are members of the teaching staff that will initially refuse to use the system due to their unawareness of the system's advantages.”
- “Our courses are specialized in the IT domain. Thus they already presuppose some basic knowledge of the learners in the area. In this respect the ontology coverage and the number and variety of learning objects is not enough.”
- “I want to test FLSS in a real setting, and get more feedback from both tutors and students. Then I can consider the system for adoption.”
Partner(s) involved: Sofia University
Service language: Bulgarian
Additional formative results (not associated with validation topics), by validation activity:
Alpha testing
- N/A, system is only alpha tested at IPP-BAS
Beta testing
- N/A (FLSS only tested at IPP-BAS, since SU participants wanted to test the ready version of the system)
Tutor interviews
- It takes some time until the tutors understand how to use the system in the best way: complementary to their traditional means or as a substitute for them.
- There is not enough metadata about the present learning objects for the tutor to get oriented quickly.
- The presentational aspect of displaying the learning object is very limited. No partitioning into Introduction, Illustrations, Exercises, etc.
Tutor workshop(s)
- Building a repository of LOs will require more effort from the staff at the beginning. This might have a negative effect on their motivation to use the services.
- In order for the tutors to add their own materials into FLSS and to process them successfully, they need more background information on the idea and architecture behind the system.
- For some topics, tutors get a partial or schematic structure for the intended course.
Teaching manager interview
- “At the moment there is no well-established program for getting basic knowledge in the IT area at the Faculty (only one optional course for BA and an MA in Computational Linguistics). This limits the interested tutors to only a few.”
- The Faculty cooperates closely with the Faculty of Mathematics and Informatics. They use Moodle for eLearning. Thus, the manager would prefer the FLSS team to try to establish FLSS there in the first place. The manager thinks they are more experienced, and therefore he would appreciate their opinion.
- FLSS is a web-based facility, but the manager is uncertain whether it would work properly if more people start to use it.
Transferability questionnaire: Institutional policies and practices
Sofia University uses Moodle as the only learning platform.
Staff at IPP-BAS do not use systems for course preparations, and while some tutors could adopt the services in their work, we do not expect IPP-BAS to impose this.
Transferability questionnaire: Relevance of the service in other pedagogic settings
Pedagogic setting | Reason(s)
Pedagogic settings for which the service would be suitable: self-directed learning; directed learning; course creation; revising for exams | Semantic search allows better location of the necessary information. The pre-selection of the learning materials in a specific (sub)domain saves time for the stakeholders, since it provides relevant information quickly and easily. The links to an ontology and to other similar documents provide means for better understanding of the topics in a structured way.
Pedagogic settings for which the service would be less suitable: social learning; essay writing | The ontology reflects the common sense expert knowledge in a certain domain. There is no support for grading essays.
Transferability questionnaire: Relevance of the service in other domains
Types of domain | Reason(s)
Types of domain for which the service would be suitable: knowledge oriented domains | Each domain requires knowledge supporting resources (ontology, lexicons).
Types of domain for which the service would be less suitable: domains that are more skill oriented than knowledge oriented are not very suitable for the service | The service requires an ontology to support the semantic search and similarity measure between documents. If the domain is not appropriate for such conceptualization, then the service is not appropriate.
Section 6: Conclusions
Validation Topics
OVT | Operational Validation Topic | Validated unconditionally | Validated with qualifications* | Not validated | Qualifications to validation
WP6.1 IPP-BAS
PVT1: Verification of accuracy (numeric)
1.1 The high proportion of the learning objects (LOs), offered by the system, is relevant to the topic, chosen by the teacher.
PVT2: Tutor efficiency
2.1 The teacher saves time when developing a course unit in FLSS compared to traditional means.
IPP-BAS
SU
2.2 The teacher invests fewer efforts (cognitive load) when developing a course unit in FLSS compared to traditional means.
IPP-BAS
PVT3: Quality and consistency of info returned by system
IPP-BAS
SU
IPP-BAS: The result depends very much on the tutor's query approach when searching, depending on whether the topic is broad (more reliable) or narrow (recall drops).
SU: No data – the verification was performed at IPP-BAS only.
OVT | Operational Validation Topic | Validated unconditionally
3.1 Teachers perceive that the learning materials offered by FLSS are useful to them in developing a course unit.
IPP-BAS
SU
PVT4: Making the educational process transparent
4.1 Using the ontology assists the teacher in establishing the hierarchy of main concepts within the course unit.
PVT5: Quality of educational output
5.1 The teacher thinks that the quality of the derived main structure of a course, together with its relevant support material, is good.
IPP-BAS
SU
5.2 An advantage of FLSS is that the search can return learning materials in other languages, providing teachers with a wider range of materials for multi-lingual learners
IPP-BAS
SU
PVT6: Motivation for learning
6.1 NOT APPLICABLE
PVT7: Organisational efficiency
7.1 There is a saving in institutional resources overall
PVT8: Relevance
Validated with qualifications* | Not validated | Qualifications to validation
IPP-BAS
SU
There is insufficient evidence to prove that there is a saving in resources.
OVT | Operational Validation Topic | Validated unconditionally
8.1 The service meets one or more institutional objectives
IPP-BAS
SU
PVT9: Likelihood of adoption
9.1 Users were motivated to continue to use the system after the end of the formal validation activities
IPP-BAS
SU
9.2 A high score was obtained in the generic questionnaires (based on UTAUT: likelihood of adoption).
IPP-BAS
SU
9.3 Tutors attending a dissemination workshop give high scores to the question 'how likely are you to consider adopting the service in your own educational practice?'
SU
9.4 Teachers and managers are motivated to adopt the system, because it suggests multilingual search.
IPP-BAS
SU
Validated with qualifications* | Not validated
IPP-BAS
Qualifications to validation
IPP-BAS: Unclear of the extent to which individual tutors attending would use FLSS educationally (rather than in research), though future use in SINUS noted
Exploitation (SWOT Analysis)
The objective you are asked to consider is: "FLSS (v1.5) will be adopted in pedagogic contexts beyond the end of the project".
Strengths
The strengths of the system (v1.5) that would be positive indicators for adoption are:
- FLSS saves time in course construction
- FLSS supports reuse of learning objects in different tasks and courses
- FLSS supports reuse of learning objects in multilingual settings, as multilingual retrieval of learning materials is provided
- the ontology helps tutors to structure the course
- the teaching process is optimized because FLSS provides easy search for relevant materials and facilitates course creation
Weaknesses
The weaknesses of the system (v1.5) that would be negative indicators for adoption are:
- complex interface
- although tutors can use the ontology to assist in structuring a course, they cannot build the whole course in FLSS
- adding new learning objects is not trivial
- a lot of initial investment of effort in familiarization with the system
- no thematic classification along with the ontological is-a model
- adaptation to another domain requires effort
- accuracy of retrieval could be further improved
- response time could be improved
Opportunities
The system has potential as follows:
- FLSS can be used for evaluating already compiled curricula.
- FLSS can effectively maintain the teaching process.
- FLSS substantially reduces the tutors' time for course preparation, and thus frees time for research and professional development of the teaching staff.
- FLSS can support semi-automatic generation of metadata, which can be used as additional features for retrieval of relevant learning material.
- FLSS can support other domains and languages (for example, the SQL related courses mentioned by tutors). It is better suited to domains in which conceptual knowledge (not skills) is the main target of the learning process and which have, or might have, a formalized domain ontology equipped with lexicons, and to languages that have basic NLP tools for initial processing.
Threats
- The tutors are not used to working with a complex system. Thus, they need some time and good will to adjust and discover its advantages.
- Specific LMSs are sometimes used in the teaching process, in which, however, this system is not incorporated. Thus, additional tuning by FLSS staff and willingness by the tutors are required for a smart combination and better results.
- The usefulness of the FLSS repository depends very much on the topic, the level of teaching (BA, MA, other) and the pedagogic task, since it is of a limited size and needs updates with respect to the mentioned factors.
- In many domains FLSS will not be accepted due to lack of formally expressed conceptual information.
- The adoption of FLSS can be delayed due to lack of sufficient quantity of learning materials.
Overall conclusion regarding the likelihood of adoption of FLSS Version 1.5:
FLSS was well-received at the pilot sites and tutors found two aspects of FLSS helpful in developing a course: the retrieval of multilingual learning
materials and the ontology for structuring the course. These aspects led to savings in tutor time, a major institutional driver. Teaching managers
at both pilot sites would consider further use of the software, for (1) teaching and (2) evaluation of curricula. The pilot sites noted that they would
like to test FLSS in various contexts prior to institutional adoption. It takes time for people to see the full potential of FLSS, since they
usually rely on popular, widely known platforms (such as Moodle and ILIAS).
Adoption beyond these sites is dependent on further dissemination and exploitation activities, as well as the availability of a suitable repository of
learning objects and ontology. Experience at SU showed that FLSS is not intuitive to new users, so work on improving the user interface will be
important for extending adoption. Further extension of FLSS to internalise the entire course creation process within FLSS would also be helpful.
Overall, our conclusion is that the software v1.5 meets a real need in an effective way, though with a very small repository and for a restricted area
of the IT domain. The cost of small scale pilots (as proposed at IPP-BAS and SU for further exploration) is likely to be prohibitive for most
institutions/courses. The effort involved in setting up and maintaining the repository of learning materials and possibly the ontology would suggest
that FLSS can only be adopted widely for course construction where there are national or international initiatives to fund these activities.
Most important actions to promote adoption of FLSS:
Technical
make the user interface more intuitive
internalise the process of course development within FLSS
further improve the accuracy of retrieval of learning materials
improve the response time
optimize the NLP processing module
make FLSS compatible with the LMS architecture so it can interoperate with institutional VLEs
Dissemination and exploitation
establish user groups at the two validation sites, with regular workshops to discuss problems related to concrete usage of the
system and the extension of ontologies, lexicons and the repository (see the next point)
provide more use cases in order to make explicit its full potential
provide scenario and use case in another (sub)domain (SQL, for example)
provide exhaustive guidelines on the FLSS exploration
organize dissemination activities and attract interested parties
Organisational
create a mechanism for enriching the repository, ontology, lexicons and grammars in collaboration with other educational institutions or at
national/international level.
provide help to users in creating resources (ontologies, learning objects, etc.) for new domains
Usability
provide more interactivity between the system and the stakeholder.
enrich the explicit information about the learning material (relations among concepts and terms; statistics of concept occurrences over a
group of documents, etc.)
Section 7 – Road map
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important future
enhancements to the system in order to meet stakeholder requirements:
Most important:
1. to make the process of structuring a course with augmented LOs internal to FLSS. At present, only browsing the ontology and
retrieving the relevant material can be done inside FLSS.
2. to show aggregated statistics over not only one document, but also over the learning objects selected by the stakeholder
3. to enrich the visual representation (for example, bigger windows for manipulation, more explicit connections among the data)
4. to allow more interventions and interactivity on the user's side (for example, adding concepts in the ontology or exchanging comments on the
data)
5. to localize the interface in Bulgarian (with a possibility for other languages different from English).
Other:
to reduce the ontology upload time when working with bigger ontologies
to implement data architectures for more use cases (for example, to upload another ontology with a related lexicon)
to make FLSS compatible with the LMS architecture
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important changes to
the current scenario(s) of use in order to meet stakeholder requirements:
Most important:
1. to organize a more gradual introduction to FLSS for the stakeholders before they start exploring it.
2. to provide several possible templates of using FLSS, apart from the main one, which is: first, consulting the ontology and then searching for the
relevant LOs. Possible cooperation with psycholinguists and designers of user interfaces is envisaged here.
3. to provide evidence that the scenario can be parameterized for different tasks (basic or advanced course, etc.) and in more subdomains
4. to build the scenario into a real-life architecture (for example, to test it on a whole course, not just on a course unit)
5. to set up a full multilingual architecture for at least two languages (interface, rich lexicon, search).
Other:
a clear procedure for automatic and manual support of the users should be designed and tested by FLSS team.
presentational partitioning is to be done into Introduction, Illustrations, Exercises, etc.
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are possible additional educational
contexts for future deployment:
Most important:
1. Educational institutions and other interested parties in Europe should be attracted to contribute to a larger repository, well structured into
sub-domains and based on ASP, and also to provide resources for other languages. Initially, the European CLARIN infrastructure is considered
for such cooperation.
2. Courses for advanced students in IT at Universities can be developed.
3. FLSS can be tested in various sub-domains (web design, editing, presenting, etc.)
4. FLSS can be introduced in high schools (especially grades 10 and 11, where a curriculum in IT is followed)
5. FLSS can be used for evaluating already designed curricula in other educational projects.
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important issues for
future technical research to enable deployment of language technologies in educational contexts:
Most important:
1. A more stable synchronization among the various language resources is envisaged in the interface - ontologies, lexicons, LOs. For example,
there is no visible connection between the concept in the ontology and the concept occurrences within LOs. Also, users get confused by the
chaotic and cumulative suggestions for using the various facilities.
2. More efforts have to be invested into the optimization of the NLP processing pipes with respect to a domain and a language. There should be a
mechanism for adjusting the existing tools to work together in a pipe. Also, the addition of a new (sub)domain or language should be simplified.
3. The integration or compatibility of stand-alone systems into bigger architectures (such as LMS-es) needs further investigation. When the
possible interaction between such systems is clearly defined in advance for the stakeholders, the adoption in various educational contexts would
become more likely.
4. The overall behaviour of the system should be considered. For example, what time parameter is acceptable for the user to upload some
material and process it, or to wait for the result from the query; how many error messages and unexpected bugs are tolerable to the user; what
set-up is more intuitive for handling the task, etc.
5. Still in FLSS there is useful information that cannot be explored, because it is hidden (i.e. without visualization). Stakeholders would like to see
more explicit relations among the data and the resources. This will be handled partly by additional statistical parameters and partly by graphical means.
Roadmap - validation activities
Further validation planned for beyond the end of the project:
Claim (OVT): Teachers perceive that the learning materials offered by FLSS are useful to them in developing a whole course.
Methodology: 1. questionnaire, 2. comparison of the developed courses to state-of-the-art ones, 3. a manager's assessment of the quality of the
course, 4. test of the developed courses in a real setting (analysis of students' opinions)
Objective (OVT): OVT 2.1 The teacher saves time when developing a course unit in FLSS compared to traditional means.
Methodology: 1. comparing the time invested by two tutors: one using the materials and facilities of FLSS, the other using the web or other
sources.
Appendix B.7 Validation Reporting Template for WP6.2 (PUB-NCIT & UU)
Section 1: Functionality implemented in Version 1.5 and alpha / beta-testing record
Brief description of functionality | Version number of unit | Changes from Version 1.0
Knowledge Discovery – LOs | V1.5 | Provide scientific documents from Bibsonomy in addition to materials from social network sites
Knowledge Discovery – LOs | V1.5 | Indicate where documents come from (e.g. Delicious, SlideShare, YouTube)
Knowledge Discovery – LOs | V1.5 | Dynamic (instead of static) social data retrieval (Delicious, YouTube, SlideShare, and Bibsonomy)
Knowledge Discovery – LOs | V1.5 | Disambiguated search results on the basis of the ontology
Ontology visualisation | V1.5 | Show shortest path from concept1 to concept2
Ontology visualisation | V1.5 | Make ontology fragment dynamic
Ontology Enrichment | V1.5 | Disambiguation integrated in ontology enrichment
Knowledge Discovery – LOs / Ontology visualisation / Social learning - LOs | V1.5 | Combined Social and Semantic Search service
Definition finder | V1.5 | Reduced length of definitions
People finder | V1.5 | Show how people are related
Help Functionality | V1.5 | Written a Quick Start Guide
Scalability of the software | V1.5 | Improved scalability to enable working with large groups of people
Crawler | V1.5 | Distributed crawler instead of serial crawler
Facebook and Twitter crawlers | V1.5 | Implemented two additional crawlers for crawling social network data*
Caching | V1.5 | Implemented caching for ontology requests, search, and recommendation
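The "Show shortest path from concept1 to concept2" entry above can be illustrated with a minimal sketch. The report does not describe how the ontology visualisation computes the path; the breadth-first search below, the function name `shortest_path` and the small ontology fragment are assumptions made purely for illustration.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Return the shortest list of concepts linking start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}          # concept -> concept it was reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in previous:
                continue              # already visited
            previous[neighbour] = node
            if neighbour == goal:
                # Walk back from the goal to the start, then reverse.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None

# Hypothetical ontology fragment (undirected relations listed in both directions).
ontology = {
    "HTML": ["Markup language", "CSS"],
    "Markup language": ["HTML", "XML"],
    "XML": ["Markup language", "XML Schema", "XPath"],
    "XML Schema": ["XML"],
    "XPath": ["XML"],
    "CSS": ["HTML"],
}

print(shortest_path(ontology, "HTML", "XPath"))
```

Breadth-first search is chosen here only because all relations are treated as having equal weight; a weighted ontology would need a different algorithm.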
Alpha-testing
Pilot site and language
PUB-NCIT (English)
Date of completion of alpha testing:
30/09/10
Who performed the alpha testing?
Vlad Posea
Pilot site and language
UU (English)
Date of completion of alpha testing:
14/10/10
Who performed the alpha testing?
Thomas Markus, Eline Westerhout, Paola Monachesi
Beta-testing
Pilot site and language: PUB-NCIT (English)
Has stakeholder validation taken place using the service embedded in Elgg (Yes/No/Partially):
yes
If ‘No’ or ‘Partially’, give reasons:
The widgets were embedded in Moodle – the learning environment used in PUB-NCIT
beta-testing performed by:
Costin Chiru (tutor), Radu Vasiliu (learner)
beta testing environment (stand-alone service / integrated into Elgg):
widgets embedded into Moodle
HANDOVER DATE:
5th October 2010
(Date of handover of software v.1.5 for validation)
Pilot site and language: UU (English)
Has stakeholder validation taken place using the service embedded in Elgg (Yes/No/Partially):
Partially
If ‘No’ or ‘Partially’, give reasons:
Knowledge Discovery Service: stand-alone, because the software had to be integrated into the learning
environment used within the UU course (WebCT), which was easier with the stand-alone version.
Social Learning Service: the Elgg-widgets have been used.
beta-testing performed by:
Erna Kotkamp (tutor)
beta testing environment (stand-alone service / integrated into Elgg):
Stand-alone Knowledge Discovery service and Elgg Social Learning services, both integrated in WebCT
HANDOVER DATE:
21 October 2010
(Date of handover of software v.1.5 for validation)
Section 2: Validation Pilot Overview
NB Information about pilot sites, courses and participants has been transferred to Appendix A.3
Pilot task
Pilot site: PUB-NCIT
Pilot language: English
What is the pilot task for learners and how do they interact with the system?
The pilot was embedded in the Human-Computer Interaction Course at the Politehnica University of Bucharest. Participation in the experiment was
obligatory for all students.
The course consists of 3 hours of presentation and 2 hours of labs weekly. The lab tasks aim to produce outputs, like scripts, HTML/Javascript
interfaces or XML Schemas. The software has been embedded in Moodle, the learning management system that was used for this course. The
software was first presented to the students and then made available to them for more than one month. The students were
advised to use the software to look for additional materials or to visualize the concepts in the domain. They
mostly used the software in the lab or at home while they were solving their assignments. They used the iFLSS to find learning materials to
supplement the ones already provided by the teaching team inside the course management system.
The link to the software was placed near the link to the official course documentation to encourage learners to use it. The students clicked the link to
find additional documentation and searched for documents in the tutors' social networks and visualized the concepts using the knowledge discovery
service. After reading the documentation and listening to a short presentation in the class the students had to carry out small tasks. If the students
had problems dealing with the tasks or with the iFLSS they could ask a tutor for help.
What do the learners produce as outputs? Are the outputs marked?
Learners produced outputs from the lab tasks, like scripts or XML Schemas. These outputs are marked manually. The lab activity counts for 15% of the total
mark, and the labs in which the experiment was performed counted for 5% of the total course mark. The labs are not very difficult, but they
require a large amount of work in a short time.
Note that because the learners used the software within a real task contributing to their course mark, all learners had to be treated in the same way,
so we could not use a control group.
How long does the pilot task last, from the learners starting the task to their final involvement with the software?
1 month
How do tutors/student facilitators interact with the learners and the system?
During the labs, the tutors present a technology, like XML, XML Schema or Javascript to the students. The tutors also offer support to the students
while solving the tasks.
In order to profit as much as possible from the software (social learning service), the tutors also have to add content to their accounts on social networking
sites. This task has to be done a few days before the labs, so that the iFLSS could index these documents.
Describe any manual intervention of the LTfLL team in the pilot:
The previous validation revealed that learners do not use social networking sites for bookmarking purposes very often. Therefore, for the Social
Learning service, we used the tutors' social networking accounts for searching instead of those of the learners.
Pilot site: UU
Pilot language: English
What is the pilot task for learners and how do they interact with the system?
The validation has been integrated in an existing ICT course at the Humanities faculty. Around 200 students followed this course, and this group was
divided into eight groups of 20-25 students. The validation was carried out with two of these groups. Participating in the experiment was obligatory for
all students in these groups.
Within the course, the students spent three weeks learning HTML and CSS. They used the iFLSS for different tasks, such as finding documents,
finding people, identifying relationships between topics, and learning how a domain is structured. The software has been embedded in the WebCT
environment used in the course.
What do the learners produce as outputs? Are the outputs marked?
In the course, the students have to build their own web page. The course schedule and assignments were fixed already and we were not allowed to
change this. We therefore designed some additional assignments for the experiment groups in which the students were explicitly asked to use the
iFLSS. The outputs are not marked.
How long does the pilot task last, from the learners starting the task to their final involvement with the software?
3 weeks
How do tutors/student facilitators interact with the learners and the system?
Each week there was a hands-on session, in which the tutors introduced some new concepts and assisted the students when necessary. The
students finished their assignments during these sessions. The tutors used the system to introduce new topics using the ontology and the learning
materials. They also pointed the students to the system when they had questions that could be answered using the system.
Describe any manual intervention of the LTfLL team in the pilot:
The previous validation showed that students generally do not use the social networking sites that are used in the iFLSS system very often. We
therefore decided to employ the network of an LTfLL member instead of letting them create their own accounts.
Experiments
Pilot Site: PUB-NCIT
Name of experiment: Ranking
Objective(s):
Verification of social search
Details: We asked 25 students to rate the results returned by the social search on 5 terms:
- 1 term that the tutor didn't explicitly use in his social network,
- 1 term used as a tag by the tutor but that had alternative spellings for its tags and produced errors,
- 1 term that the tutor explicitly used as tag, and had no alternative spellings
- 2 terms of their own choice.
The first 5 results for each search were re-ranked by the students. If the student considered a result should not be in the first five results, he marked
the result with an X. The results of the experiment are presented in OVT 1.1.
Learners were also asked to consider whether the search returns people relevant to the search topic. They were asked to perform queries on the
same five terms as above (3 given terms and 2 terms of their own choice) and to decide for the first 5 results whether they are relevant. The results
of this experiment are presented in OVT 1.2.
Pilot Site: UU
Name of experiment: Relevance of learning materials recommended by the knowledge discovery service
Objective(s): Investigate whether learning materials retrieved by the system are relevant.
Details: Since the semantic search based on the ontology is especially useful for disambiguating terms, the experiment focuses on ambiguous
queries (e.g. python, java). From the complete set of ambiguous terms in the ontology, we manually selected 32 queries that are
representative of the searches a student will typically encounter. For each of these queries, the relevance of the first 20 results was judged by two
domain experts. For more details on this experiment, we refer to Markus et al. (submitted).
Pilot Site: UU
Name of experiment: Logging usage statistics
Objective(s): Measuring the motivation to keep using the system after the pilot
Details: The usage of the system during the validation sessions and outside of the sessions has been logged. The system has been used by 49
students during the validation. 26 out of these 49 students (53.1%) have opened the system outside the sessions as well. In addition, 13 other
students who did not participate in the experiment, have accessed the software.
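The figures above follow from simple counting over the usage log. The sketch below is illustrative only: the log format (pairs of student identifier and a flag for whether the access happened during a validation session), the function name `summarise_usage` and the generated identifiers are assumptions, not the logging format actually used at UU.

```python
def summarise_usage(log, participants):
    """Count participants who returned outside the sessions, and other users."""
    users_outside = {sid for sid, during_session in log if not during_session}
    returning = users_outside & participants      # participants seen outside sessions
    others = users_outside - participants         # users who were not in the experiment
    pct = round(100.0 * len(returning) / len(participants), 1)
    return len(returning), pct, len(others)

# Hypothetical log reproducing the reported counts (49 participants, 26 returning, 13 others).
participants = {f"s{i}" for i in range(49)}
log = [(sid, True) for sid in participants]
log += [(f"s{i}", False) for i in range(26)]
log += [(f"x{i}", False) for i in range(13)]

print(summarise_usage(log, participants))  # (26, 53.1, 13)
```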
Section 3: Results - validation/verification of Validation Topics
OVT:
1.1
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The social learning service provides a high proportion of learning materials that match the search topic
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology:
A group of 25 students judged the results for 5 queries
Results:
The percentages of results that learners considered useful were as follows:
Term 1 (xml; in course materials but not tagged by tutor): mean 58% (SD=27.38%)
Term 2 (xmlschema/XML Schema: tagged by tutor under one of two alternative spellings): mean 35% (SD=25.35%)
Term 3 (xpath: tagged by tutor): 85% (SD=21.81%)
Terms 4 & 5 (chosen by learners): 62% (SD=33.40%)
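The means and standard deviations above can be derived from the students' re-rankings roughly as sketched below. The ratings shown are invented for illustration, and the report does not state whether the sample or the population standard deviation was used; the sketch assumes the sample version.

```python
from statistics import mean, stdev

def useful_percentage(marks):
    """marks: one entry per result shown; 'X' means it should not be in the first five."""
    kept = sum(1 for m in marks if m != "X")
    return 100.0 * kept / len(marks)

# Hypothetical re-rankings by four students for one search term.
ratings = [
    ["1", "2", "3", "4", "5"],
    ["1", "X", "2", "3", "X"],
    ["X", "1", "2", "X", "X"],
    ["1", "2", "3", "X", "4"],
]

percentages = [useful_percentage(m) for m in ratings]   # [100.0, 60.0, 40.0, 80.0]
print(f"mean {mean(percentages):.0f}% (SD={stdev(percentages):.2f}%)")
```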
OVT:
1.1
Pilot site
UU
Pilot language
English
Operational Validation Topic
The knowledge discovery service provides a high proportion of learning materials that match the search topic and are
suitable as learning materials
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: The top-20 results on 32 ambiguous queries have been judged by two domain experts
Results:
Contrary to regular search engines, the knowledge discovery service is able to deal with ambiguous queries because it is based on an ontology. The
majority of learning materials retrieved for ambiguous queries match the search topic and are suitable as learning materials (recall: 0.77, precision:
0.75). It should be noted that the search methods provided within the social network itself (i.e. the APIs) performed considerably worse on the
same queries (precision: 0.22, which means that only about 1 out of 5 results is relevant). The recall cannot be measured in this case, since we do not
know the number of relevant resources on the Internet.
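As a reminder of how the two measures are defined, the sketch below computes precision and recall for a single query from a set of retrieved documents and a set of documents judged relevant. The document identifiers and the function name `precision_recall` are hypothetical; the actual judgments are reported in Markus et al. (submitted).

```python
def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved; recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical query: four documents retrieved, four known to be relevant.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d2", "d3", "d7"]

print(precision_recall(retrieved, relevant))  # (0.75, 0.75)
```

Recall needs the complete set of relevant documents as its denominator, which is why it can be given for the ontology-based repository but not for searches over the open Internet.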
OVT:
1.2
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The social network service suggests a high proportion of people relevant to the search topic
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology:
Students search for relevant people related to five queries (similar to OVT1.1, PUB-NCIT)
Results:
In this context, relevance is defined as relevance to the search query, i.e. people who are useful to ask about the topic. The percentages of results
that learners considered useful were as follows:
Term 1 (xml; in course materials but not tagged by tutor): mean 92% (SD=23.52%)
Term 2 (xmlschema/XML Schema: tagged by tutor under one of two alternative spellings): mean 63% (SD=21.38%)
Term 3 (xpath: tagged by tutor): 84% (SD=25.32%)
Terms 4 & 5 (chosen by learners): 85% (SD=30.12%)
Learners had been asked to examine the first five results. However, it was noted that some queries did not return five results if the tutor's network
contained fewer than five people relevant to the search topic.
OVT:
1.3
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The average learner's social network has enough people in it who can help him
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology:
Analysis of networks to find number of relevant connections
Results:
An experiment in which the networks of 11 tutors were analysed revealed that tutors have access to 41 relevant persons on average. From this
experiment, it is not clear which of these connections would be actually available for support, like chat or e-mail, but the documents they publish or
bookmark can be used as learning materials for the students. The experiment investigated the accounts of 11 tutors on 5 platforms: Delicious.com,
YouTube.com, Flickr.com, Twitter.com and SlideShare.net. We examined both the initial structure of the tutors' networks and the networks' evolution
over time:
Tutors examined: 11
People in the tutors' networks: 456
Average number of resources posted in the network: 47/day
Average relations created in the network: 63/day
More details on this experiment can be found in Stoica et al. (2010) (see footnote 1).
OVT:
2.1
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
Tutors have to spend less time finding relevant learning materials and helping the learner to identify related concepts
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly Agree | n
Tutors | KD34. I have to spend less time finding relevant learning materials when I use the Knowledge Discovery Service. | Experimental | 2.5 | 0.58 | 0 | 4
Tutors | KD35. I have to spend less time to identify concepts related to the course topics when I use the Knowledge Discovery Service | Experimental | 2.5 | 0.58 | 0 | 4
Tutors | SL34. I have to spend less time finding relevant learning materials when I use the Social Learning Service. | Experimental | 3.33 | 0.58 | 33 | 3
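The summary columns used in the questionnaire tables (mean, standard deviation, % Agree / Strongly Agree and n) can be computed as in the sketch below. The individual responses are not given in this report, so the response lists are hypothetical and were chosen only because they are consistent with the figures reported for KD34 and SL34 above. The sketch further assumes a 5-point scale on which 4 and 5 count as Agree / Strongly Agree, and the sample standard deviation.

```python
from statistics import mean, stdev

def summarise(responses, agree_threshold=4):
    """Return (mean, sample SD, % agree or strongly agree, n) for Likert responses."""
    agree = sum(1 for r in responses if r >= agree_threshold)
    pct_agree = 100.0 * agree / len(responses)
    return (round(mean(responses), 2), round(stdev(responses), 2),
            round(pct_agree, 1), len(responses))

# Hypothetical responses consistent with the reported summaries.
print(summarise([2, 2, 3, 3]))  # (2.5, 0.58, 0.0, 4)
print(summarise([3, 3, 4]))     # (3.33, 0.58, 33.3, 3)
```

With only three or four tutors per site, a single response changes the percentage by 25 to 33 points, so these summaries are best read alongside the formative comments.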
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“It tells me stuff I already know.” (about the KD)
Tutors
“I gain time only after investing some time in it.” (SL)
Footnote 1: Anamaria Stoica, Vlad Posea, Cristina Scheau and Mihai Teleru. An Analysis of the Usage of Social Networking Web Sites for Learning Purposes.
OVT:
2.1
Pilot site
UU
Pilot language
English
Operational Validation Topic
Tutors have to spend less time finding relevant learning materials and helping the learner to identify related concepts
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly Agree | n
Tutors | KD34. I have to spend less time finding relevant learning materials when I use the Knowledge Discovery Service. | Experimental | 3.67 | 0.58 | 66.7 | 3
Tutors | KD35. I have to spend less time to identify concepts related to the course topics when I use the Knowledge Discovery Service | Experimental | 3.67 | 0.58 | 66.7 | 3
Tutors | SL34. I have to spend less time finding relevant learning materials when I use the Social Learning Service. | Experimental | 3.33 | 1.52 | 33.3 | 3
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“The knowledge discovery service provides better results than Google, because the results are filtered”, i.e. it searches on the
basis of bookmarks from other people who considered the information relevant.
Tutors
“The usefulness of the social learning services depends on your network.”
OVT:
2.2
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
There is less cognitive load for the tutor to help the learners to find relevant learning materials and to help the
learner to identify related concepts
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly Agree | n
Tutor | KD11a. Please rank on a 5-point scale the mental effort you invested to accomplish teaching tasks using the Knowledge Discovery service (1=Very High, 5=Very low). | Experimental | 4 | 0.82 | n.a. | 4
Tutor | KD11b. Overall, using the knowledge discovery system requires significantly less mental effort to complete my teaching tasks than when using Google. | Experimental | 2.25 | 0.96 | 0 | 4
Tutor | KD36. The cognitive load for finding relevant learning materials to be used in the course is lower when I use the system | Experimental | 2.75 | 0.96 | 0 | 4
Tutor | KD37. The cognitive load for identifying concepts related to the course topics is lower when I use the system | Experimental | 2.75 | 0.96 | 0 | 4
Tutor | SL11a. Please rank on a 5-point scale the mental effort you invested to accomplish teaching tasks using the Social Learning service (1=Very high, 5= Very low). | Experimental | 3.67 | 0.58 | n.a. | 3
Tutor | SL11b. Overall, using the social learning system requires significantly less mental effort to complete my teaching tasks than when using Google. | Experimental | 2 | 1 | 0 | 3
Tutor | SL35. The cognitive load for finding relevant learning materials to be used in the course is lower when I use the system | Experimental | 2.33 | 0.58 | 0 | 3
Formative results with respect to validation indicator
Stakeholder type
Results
Tutor
“It's hard to compare the system with Google given that it is very familiar to us, as it is to students, and it has also specialized
tools like Google Scholar.”
OVT:
2.2
Pilot site
UU
Pilot language
English
Operational Validation Topic
There is less cognitive load for the tutor to help the learners to find relevant learning materials and to help the learner
to identify related concepts
Summative results with respect to validation indicator
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | % Agree / Strongly Agree | n
Tutor | KD11a. Please rank on a 5-point scale the mental effort (1 = very high mental effort; 5 = very low mental effort) you invested to accomplish teaching tasks using the Knowledge Discovery service. | Experimental | 3.33 | 0.58 | n.a. | 3
Tutor | KD11b. Overall, using the knowledge discovery system requires significantly less mental effort to complete my teaching tasks than when using Google. | Experimental | 3.67 | 0.58 | 66.7 | 3
Tutor | KD36. The cognitive load for finding relevant learning materials to be used in the course is lower when I use the system. | Experimental | 3.67 | 0.58 | 66.7 | 3
Tutor | KD37. The cognitive load for identifying concepts related to the course topics is lower when I use the system. | Experimental | 3.67 | 0.58 | 66.7 | 3
Tutor | SL11a. Please rank on a 5-point scale the mental effort (1 = very high mental effort; 5 = very low mental effort) you invested to accomplish teaching tasks using the Social Learning service. | Experimental | 3.33 | 1.15 | n.a. | 3
Tutor | SL11b. Overall, using the social learning system requires significantly less mental effort to complete my teaching tasks than when using Google. | Experimental | 3.33 | 1.15 | 66.7 | 3
Tutor | SL35. The cognitive load for finding relevant learning materials to be used in the course is lower when I use the system. | Experimental | 3.67 | 1.15 | 33.3 | 3
Formative results with respect to validation indicator
Stakeholder type
Results
Tutors
“It would be useful if the tutor could have total control over the knowledge discovery service (i.e. possibility to change and
manipulate content, ontology, etc.) Its use will be affected by how difficult it is to manipulate it.”
Tutors
“Both services could be improved by giving tutors the option to modify the content." The tutors specified what they would like to
be able to change: "adapt the ontology for a specific course by marking the course concepts in a different color" and "allowing
tutors to add more information about learning materials.”
Tutors
“The social learning service should allow tutors to categorize the resources, to provide comments about the learning materials.”
OVT:
3.1
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The learners judge the learning materials provided by the system as being relevant for their learning task
Summative results with respect to validation indicator
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
KD7. The learning materials provided by the knowledge discovery
service are relevant for my learning task.
Experimental
3.3
0.91
35
20
Learners
SL7. The learning materials provided by the social learning service are
relevant for my learning task.
Experimental
2.9
0.83
28
25
Formative results with respect to validation indicator
Stakeholder type
Results
Learner
The results are not oriented towards problem solving: “If I want to find the solution for a very specific bug I can’t find anything useful
here”
Learner
“When I have to do something quick I often don’t have time to watch training videos – I go directly to an example”
Learner
“I would rather click the link returned by the (social) search if I’d see more information about it in the window (like Google search
results)”
Learner
“If the teaching assistant bookmarked this it must be good”
OVT:
3.1
Pilot site
UU
Pilot language
English
Operational Validation Topic
The learners judge the learning materials provided by the system as being relevant for their learning task
Summative results with respect to validation indicator
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
KD7. The learning materials provided by the knowledge discovery
service are relevant for my learning task.
Experimental
3.2
0.93
42.9
35
Learners
SL7. The learning materials provided by the social learning service are
relevant for my learning task.
Experimental
3.32
0.91
47.1
35
Learners
G7. The learning materials provided by Google are relevant for my
learning task.
Control
3.4
0.71
29.4
18
Formative results with respect to validation indicator
Stakeholder type
Results
Learner
The learning materials provided by the social learning service are appreciated because “they are more trusted than the Google
results”. Students like the idea that “you can see which documents their tutor trusts” (More about this in OVT3.8)
Learner
It is useful to be able to find learning materials from various sources, especially the YouTube videos are appreciated by the
learners. The learners remarked that it depends on the learning context whether videos are wanted: “When you are solving a
task, you do not want to spend time watching a video, it is better to have a document in such a situation. But a video might be
useful when you study.”
Learner
“More information would help to decide whether documents are relevant, for example something like Google has, a short text
which contains the term [comment UU: student meant the result snippets].”
Learner
Google generally suggests “easier documents, which is often enough to find an answer to questions.”
OVT:
3.2
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The learners judge the people proposed by the social network service as being relevant
Summative results with respect to validation indicator
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
SL8. The people proposed by the social learning service are relevant.
Experimental
3.6
1.12
56
25
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“I don’t know who these people are, so I don’t know if the persons are relevant”.
Learners
“Certainly a feature I miss from search engines: finding persons that are dedicated on posting documents on what I’m interested
in”
OVT:
3.2
Pilot site
UU
Pilot language
English
Operational Validation Topic
The learners judge the people proposed by the social network service as being relevant
Summative results with respect to validation indicator
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
SL8. The people proposed by the social learning service are relevant.
Experimental
3.29
0.68
38.2
35
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
In the validation context, the network of an LTfLL member was used, which influenced the results: “I didn't know the people, so
how do I know whether they are relevant?”
Learners
Positive experience with documents from a certain person was used for future search requests: “I found a good document from
this person for my previous request, so other bookmarks from the same person are probably also interesting.”
Learners
[Learners appreciated the idea of using their tutor's network, which in a university context is a realistic alternative to using the
students' own networks.]
OVT:
3.3
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
Learners trust the retrieved learning materials more than those found by traditional means
Summative results with respect to validation indicator
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
KD9. I trust the retrieved learning materials from the knowledge
discovery service more than those found by Google.
Experimental
2.5
1.39
35
20
Learners
SL9. I trust the retrieved learning materials from the social learning
service more than those found by Google.
Experimental
2.5
1.23
20
25
Formative results with respect to validation indicator
Stakeholder type
Results
Learner
“I’m very used to searching with Google and I trust them very much. I find it amusing that you are trying to compare with them”
Learner
“I trust resources recommended by my teacher more than I trust resources recommended by Google. Sometimes they are the
same”
OVT:
3.3
Pilot site
UU
Pilot language
English
Operational Validation Topic
Learners trust the retrieved learning materials more than those found by traditional means
Summative results with respect to validation indicator
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
KD9. I trust the retrieved learning materials from the knowledge
discovery service more than those found by Google.
Experimental
3.17
1.25
54.3
35
Learners
SL9. I trust the retrieved learning materials from the social learning
service more than those found by Google.
Experimental
3.21
1.02
39.4
35
Formative results with respect to validation indicator
Stakeholder type
Results
Learner
“My tutor will probably only follow good researchers.”
The table for OVT 4.1 is missing for PUB-NCIT, because they did not include this question in their questionnaire.
OVT:
4.1
Pilot site
UU
Pilot language
English
Operational Validation Topic
Learners can independently identify gaps in their knowledge in a given domain and learn how concepts are related
to each other
Summative results with respect to validation indicator
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
KD40. The knowledge discovery software helps me to identify gaps in my
knowledge and to learn how topics are related to each other.
Experimental
3.12
0.81
35.3
35
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“When you are a beginner in the domain you don't know anything. This makes it difficult to know where to start.”
Tutors
The tutors clearly distinguish between different groups of students: “The knowledge discovery service is appropriate for doers,
whereas it might be less appropriate for students that are more insecure, they might feel better with a text that offers a clearer
path.”
OVT:
4.2
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The visual representation of the domain helps learners to understand the domain better compared to Google
Summative results with respect to validation indicator
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
KD8. Because of the visual representation of the domain I now
understand the domain better than when I would have used Google.
Experimental
3.2
1.47
45
20
OVT:
4.2
Pilot site
UU
Pilot language
English
Operational Validation Topic
The visual representation of the domain helps learners to understand the domain better compared to Google.
Summative results with respect to validation indicator
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
KD8. Because of the visual representation of the domain I now
understand the domain better than when I would have used Google.
Experimental
2.71
1.2
25.7
35
OVT:
4.3
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The visual representation of the domain helps learners to understand the domain better than they would have
without this visualization.
Summative results with respect to validation indicator
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
KD22. The visual representation of the domain helped me to learn more
about the topics covered in the course than I would have without this
visualisation.
Experimental
4.1
0.94
70
20
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“I can very rapidly discover the keywords and concepts in a domain.”
Learners
“I feel the relations between concepts are not very clearly expressed”
Learners
“It is very good as an introduction to the topic, but I feel that after I know the domain a bit it doesn’t offer anything special”
Learners
“The visual representation of the domain allows me to learn by myself”
OVT:
4.3
Pilot site
UU
Pilot language
English
Operational Validation Topic
The visual representation of the domain helps learners to understand the domain better than they would have without
this visualization.
Summative results with respect to validation indicator
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
KD22. The visual representation of the domain helped me to learn more
about the topics covered in the course than I would have without this
visualization.
Experimental
3.12
0.88
38.2
35
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“The visualisation does not contain enough information about the relations between the terms, which makes it difficult to
understand the domain on the basis of the graph.”
Learners
“How can I see what I should know when I don't know anything yet?”
Learners
“Not useful to search, it is not clear how you search, all these terms are not known and you don't know what to click.”
Learners
“There is too much information, we don't need it, we don't need/want to learn more, we simply want to learn what we need to
pass the test and we want to figure it out quickly. No distinction is made in the graph.”
Learners
“I have a lot of experience with using mind maps, which are quite similar to the graphs. I like the visualisation.”
Tutors
The information overload in the visualisation was mentioned several times in the interviews: “It makes it difficult to see what is
most relevant”. A suggestion provided was: “give the topics that are obligatory for the course a different colour to help students”.
The course was still running when we wrote the VRT. The quality of the educational output is measured only at the end of the course, when the students have
to hand in their assignments (PUB-NCIT) or take an exam (UU). We therefore have no results regarding the influence of the iFLSS on the quality of the
educational output.
OVT:
6.1
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The learners perceive that the iFLSS supports more self-directed learning compared to traditional means
Summative results with respect to validation indicator
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
KD26. The knowledge discovery service supports more self-directed
learning compared to other tools I use.
Experimental
3.5
0.89
40
20
Learners
SL25. The social learning service supports more self-directed learning
compared to other tools I use.
Experimental
3.0
1.02
28
25
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“The visual representation of the domain allows me to learn by myself”
OVT:
6.1
Pilot site
UU
Pilot language
English
Operational Validation Topic (see footnote 2)
The learners perceive that the iFLSS supports more self-directed learning compared to traditional means.
Summative results with respect to validation indicator
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
G25. Google supports more self-directed learning compared to other
tools I use.
Control
3.3
0.85
17.7
17
Learners
KD26. The knowledge discovery service supports more self-directed
learning compared to other tools I use.
Experimental
3.09
0.79
29.4
35
Learners
SL25. The social learning service supports more self-directed learning
compared to other tools I use.
Experimental
3.03
0.87
32.4
35
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
“The software provides an overview of the domain, this can be helpful for reflection purposes.”
Learners
“I do not have experience with learning in settings outside of the university.”
2 The interviews revealed that this question was not clear to all students and that they do not have much experience with self-directed learning.
OVT:
7.1
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
There is a saving in institutional resources overall
Formative results with respect to validation indicator
Results that inform a change to the scenario or developed software, or inform the implementation/exploitation plan
Stakeholder type
Results
Teaching
manager
“The software might bring gains in the time the professors spend creating teaching materials”
OVT:
7.1
Pilot site
UU
Pilot language
English
Operational Validation Topic
There is a saving in institutional resources overall
Summative results with respect to validation indicator
Formative results with respect to validation indicator
Results that inform a change to the scenario or developed software, or inform the implementation/exploitation plan
Stakeholder type
Results
Teaching
manager
“The set-up costs may be relatively high in the beginning, but on the long-term there will be a saving in resources.” The teaching
manager specifically mentioned the time needed to find and tune a new ontology and to build a network.
OVT:
8.1
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
The service meets one or more institutional objectives
Formative results with respect to validation indicator
Stakeholder type
Results
Teaching
manager
The teaching manager agreed on this statement.
“The system helps our students to access diverse learning materials while doing research”
OVT:
8.1
Pilot site
UU
Pilot language
English
Operational Validation Topic
The service meets one or more institutional objectives
Formative results with respect to validation indicator
Stakeholder type
Results
Teaching
manager
The teaching manager agreed on this statement. The system meets several institutional objectives:
assist learners in understanding how a domain is structured
allow learners to learn from professionals
the learners can easily identify qualitative learning materials
OVT:
9.1
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
Users are motivated to keep using the system after the end of the validation activities (see footnote 3)
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
KD35. I would like to use the knowledge discovery service after the pilot.
Experimental
3.6
0.94
55
35
Learners
KD36. If the knowledge discovery service is available after the pilot, I will
definitely use it.
Experimental
3.3
1.21
45
35
Learners
SL34. I would like to use the social learning service after the pilot.
Experimental
3.4
1.19
52
25
Learners
SL35. If the social learning service is available after the pilot, I will
definitely use it.
Experimental
3.0
1.14
36
25
Tutors
KD29. I would like to use the knowledge discovery service in my teaching
after the pilot.
Experimental
3.75
0.5
66
3
3 At PUB-NCIT, no logging has been used to measure the use after the pilot.
Tutors
KD30. If the knowledge discovery service is available after the pilot, I will
definitely use it in my teaching.
Experimental
3.25
0.5
33
3
Tutors
SL29. I would like to use the social learning service in my teaching after
the pilot.
Experimental
3.0
1.0
33
3
Tutors
SL30. If the social learning service is available after the pilot, I will
definitely use it in my teaching.
Experimental
3.33
0.58
33
3
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
[Some students showed curiosity towards the application. The others prefer to use systems with which they are already familiar
if the advantage of using the new one isn’t overwhelming.]
Learner
“I would like to use this tool (KD service) at the beginning of a course to see quickly which concepts are covered in this course.”
Learner
“I would like to play with this (SL service) from time to time as an alternative to classical search but I can’t give up the traditional
search methods”
Tutors
“The KD service is much easier to maintain (requires much less effort for me) and that is why it is much more likely for me to
adopt it”
OVT:
9.1
Pilot site
UU
Pilot language
English
Operational Validation Topic
Users are motivated to keep using the system after the end of the validation activities
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: (1) learners & tutors / questionnaire (2) students / logging usage statistics
Results: Usage of the system during and outside the validation sessions was logged. A total of 49 students used the system
during the validation. Of these, 26 (53.1%) also opened the system outside the sessions after the pilot had ended. In
addition, 13 other students who did not participate in the experiment accessed the software. These data seem to contradict the
students' questionnaire responses, in which only 20% indicated that they would be interested in using the system again.
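The comparison between logged and self-reported interest can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name `share` and the variable names are assumptions introduced here, while the three figures (49, 26 and 20%) are taken from the results reported above.

```python
def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, 1)


# Figures as reported in the logging results (variable names are hypothetical).
pilot_users = 49             # students who used the system during the validation
returning_users = 26         # of those, students who opened it again after the pilot
questionnaire_share = 20.0   # % of learners stating interest in using the system again

logged_share = share(returning_users, pilot_users)
gap = round(logged_share - questionnaire_share, 1)

print(f"Logged return rate: {logged_share}%")
print(f"Difference from stated intention: {gap} percentage points")
```

The two percentages are computed over different bases (49 logged users versus the questionnaire respondents), so the difference is indicative rather than a strict like-for-like comparison.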
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
KD35. I would like to use the knowledge discovery service after the pilot.
Experimental
2.56
0.99
20
35
Learners
KD36. If the knowledge discovery service is available after the pilot, I will
definitely use it.
Experimental
2.49
1.05
22
35
Learners
KD39. I am motivated to keep using the knowledge discovery system as
long as it is provided in WebCT.
Experimental
2.76
0.96
26.7
35
Learners
SL34. I would like to use the social learning service after the pilot.
Experimental
2.68
0.88
19.4
35
Learners
SL35. If the social learning service is available after the pilot, I will
definitely use it.
Experimental
2.62
0.85
16.1
35
Learners
SL38. I am motivated to keep using the social learning system as long as
it is provided in WebCT.
Experimental
3.12
0.98
29.0
35
Tutors
KD29. I would like to use the knowledge discovery service in my teaching
after the pilot.
Experimental
3.00
1.00
33.3
3
Tutors
KD30. If the knowledge discovery service is available after the pilot, I will
definitely use it in my teaching.
Experimental
3.00
1.00
33.3
3
Tutors
SL29. I would like to use the social learning service in my teaching after
the pilot.
Experimental
3.33
0.58
33.3
3
Tutors
SL30. If the social learning service is available after the pilot, I will
definitely use it in my teaching.
Experimental
3.0
0
0
3
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
The interviews made learners think more about the usefulness of the service and one of the students – who filled in a negative
answer in the questionnaire – remarked: “I think I'm going to miss this software in my other courses now I see the potential of it!”
Learner
“The knowledge discovery service is useful for reflection. It enables me to check very quickly whether I understand all terms for
my exam.”
Learner
“The knowledge discovery service can help you to find an original topic for a paper. It now often is the case that everyone
chooses the same topic, but I prefer to investigate a topic that is not very common.”
Learner
“I would follow student X when I had the social learning service. He works hard and always provides good summaries to other
students. He probably reads good articles.”
Tutor
“It can constitute a useful support for students that want to know more, which are often neglected during class”
Tutor
“A lot of time and energy should be invested in setting it up and this might be an obstacle in deciding to use it. It would take more
time than prepare a standard course, but if you have set it up, it seems easier to maintain”
OVT:
9.2
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
A high score was obtained in the generic questionnaires (based on UTAUT: likelihood of adoption)
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Generic questionnaire - learners
Results:
Descriptive Statistics - Learners
                          N    Mean   Std. Deviation
Effectiveness             45   3.23   0.573
Efficiency                45   3.02   0.807
Cognitive Load            45   2.76   0.981
Usability                 45   3.90   0.646
Satisfaction              45   3.39   0.870
Facilitating conditions   45   3.62   0.698
Self-Efficacy             45   3.39   0.810
Behavioural intention     45   3.28   1.069
PUB-NCIT                  45   3.39   0.514
Valid N (listwise)        45
OVT:
9.2
Pilot site
UU
Pilot language
English
Operational Validation Topic
A high score was obtained in the generic questionnaires (based on UTAUT: likelihood of adoption)
Summative results with respect to validation indicator
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology: Generic questionnaire - learners
Results:
Descriptive Statistics - Learners
                          N    Mean   Std. Deviation
Effectiveness             34   3.19   0.805
Efficiency                34   3.08   0.596
Cognitive load            34   2.65   1.098
Usability                 34   3.45   0.629
Satisfaction              34   2.79   0.637
Facilitating conditions   34   3.35   0.568
Self-Efficacy             34   3.13   0.715
Behavioural intention     34   2.65   0.793
UU                        34   3.09   0.545
Valid N (listwise)        34
OVT:
9.4
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
Learners find the information provided by the system in addition to the learning materials (e.g. titles, users,
definitions) useful for the task being undertaken.
Summative results with respect to validation indicator
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
KD The information provided by the system in addition to the learning
materials (e.g. titles, users, definitions) is useful.
Experimental
3.8
0.91
65
20
Learners
SL The information provided by the system in addition to the learning
materials (e.g. titles, users, definitions) is useful.
Experimental
3.2
0.87
48
25
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
[Several of the learners appreciated the definitions provided by the knowledge discovery service]
Learners
[Several of the learners didn’t like the tags provided by the learning service. They said they would have preferred some snippets
of what the links refer to]
OVT:
9.4
Pilot site
UU
Pilot language
English
Operational Validation Topic
Learners find the information provided by the system in addition to the learning materials (e.g. titles, users,
definitions) useful for the task being undertaken.
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
KD37a. The definitions provided by the knowledge discovery service in
addition to the learning materials are useful.
Experimental
3.21
0.81
35.3
35
Learners
KD37b. The tags provided by the knowledge discovery service in
addition to the learning materials are useful.
Experimental
3.29
0.84
38.2
35
Learners
KD37c. The document titles provided by the knowledge discovery service
in addition to the learning materials are useful.
Experimental
3.26
0.67
35.3
35
Learners
SL36a. The users provided by the system in addition to the learning
materials are useful.
Experimental
3.06
0.7
27.3
35
Learners
SL36b. The tags provided by the system in addition to the learning
materials are useful.
Experimental
3.06
0.81
29.4
35
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
Both tags and titles are relevant: “tags give information about the topics covered in the document and eventually about the type
of document, while a title directly shows what the page is about.”
OVT:
9.5
Pilot site
PUB-NCIT
Pilot language
English
Operational Validation Topic
Learners perceive that they can find learning materials more quickly compared to traditional means.
Summative results with respect to validation indicator
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
KD38. With the Knowledge Discovery service, I can find learning
materials more quickly than with Google.
Experimental
2.4
1.19
20
20
Learners
SL37. With the Social Learning service, I can find learning materials
more quickly than with Google.
Experimental
2.3
0.85
8
25
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
Sometimes the software (Social search) is slow to answer queries
Learners
[It’s easier to get results with software that you’re already accustomed to]
OVT:
9.5
Pilot site
UU
Pilot language
English
Operational Validation Topic
Learners perceive that they can find learning materials more quickly compared to traditional means.
Questionnaire
type
Questionnaire no. & statement
Experimental
/ control
group
Mean
Standard
deviation
%Agree /
Strongly
Agree
n=
Learners
KD38. With the Knowledge Discovery service, I can find learning
materials more quickly than with Google.
Experimental
2.59
0.93
14.7
35
Learners
SL37. With the Social Learning service, I can find learning materials
more quickly than with Google.
Experimental
2.65
0.81
8.8
35
Formative results with respect to validation indicator
Stakeholder type
Results
Learners
[For the social learning service, the selection process may be shorter when the tutor uploads good documents. In such cases it
can be faster than Google.]
Learners
“The time it takes to find the software in WebCT is already more than the time it takes to complete a search request with Google”
Section 4: Results – validation activities informing future changes / enhancements to the system
VALIDATION
ACTIVITY
Partner(s) involved: PUB-NCIT
Service language: English
Additional formative results (not associated with validation topics)
Alpha testing
Major changes identified during alpha testing but not yet implemented are: The path finding service, which shows how
a person is related to the user, was dropped as it generated too many requests and made the server too slow.
Tutor interviews
Tutors would like to be able to edit the ontology to add new concepts that they believe are interesting for the students
Tutors would like to be able to add content directly in the visualisation window (like links with resources for a given
concept)
Tutors argued that the search in the SL sometimes returns items completely unrelated to the domain (see footnote 4). Maybe those
results should be omitted
Learner focus group 1
KD service
Further explanations regarding the relations between concepts would be nice
A richer variety of relations is needed (e.g. JQuery is-library-for Javascript)
One person suggested introducing some time limit between two clicks, before the user can go to the next concept.
The visualisation should allow filtering by difficulty – for example a beginner doesn’t need to see terms more suited to
advanced learners from the beginning
Eliminate the “grey” results completely as they cause confusion. (Grey results are those considered ambiguous by the
Knowledge Discovery disambiguation service)
SL service
Facebook integration was strongly suggested, as it is the social network on which users spend most of their time
The system (especially the SL service) was criticised as too slow when everybody in the class uses it.
The integration to Moodle should be improved
The links provided do not offer sufficient information to understand if a link is worth visiting or not. Some students
suggested eliminating the tag list (very similar for all items returned by a query) and replacing it with some text from the
page
4 Explanation: if there are few related documents and all belong to one person, the search also returns other documents belonging to that person
A problem detected and repaired was that the search was case-sensitive. This wasn’t discovered during beta testing but
only with real users.
General
Learners would prefer to see the context in which the relevant concept is used in the document.
Learner focus group 2
(prioritisation of
enhancements)
5
Learners judged that the five most important areas for enhancement of the iFLSS (i.e. clusters) are:
Performance under heavy use (SL)
User interface (SL)
Integration with the PUB-NCIT learning environment (both SL and KD)
Support for new web sites (Facebook) (SL)
Ranking of the learning materials (both SL and KD)
Learners judged that the five most important single improvements that should be made to the system are:
1. More information displayed on the visualisation (see results from focus group 1) (KD)
2. Replacing tags in the social search with snippets showing what the link is about (SL)
3. Integrate the widgets in Moodle and not link to external sites (SL)
4. Search faster (SL)
5. Add Facebook as a source of information (SL)
Teaching manager
interview
The teaching manager considered the applications interesting and useful.
The results returned by the search were considered to be good
The visualisation was considered to be useful
User interface on KD and SL could be improved (access keys, more content offered for search)
Add some help information
5 The iFLSS uses tags instead of the document text to search for relevant documents. This makes the inclusion of snippets problematic.
VALIDATION
ACTIVITY
Pilot partner: UU
Service language: English
Additional formative results (not associated with validation topics)
Alpha testing
Major changes identified during alpha testing but not yet implemented are:
showing the relation path (i.e. how is a person related to me?) in the social learning service was too slow, so we
left this feature out of the validation
Beta testing
Tutor interviews
Labels in the ontology sometimes overlap
Improved support for tutors would be appreciated:
option to mark course concepts in different colour to prevent information overload
possibility to change ontology for tutors
include option to add comments to social network resources (e.g. tutor's opinion about a resource)
Learner focus group 1
Knowledge discovery service
The problem of information overload needs to be addressed. There is too much information in the knowledge discovery
service for uncertain beginners.
The knowledge discovery service should make clear what the students have to know and what is considered optional.
Note: if this feature were adopted, a distinction between using the software in formal and informal learning contexts
would be necessary.
Learners would like to have clear information about how the terms are related.
It depends on the learning style of a learner whether or not the knowledge discovery service is useful.
Visually overlapping terms in knowledge discovery service should be avoided.
Social learning service
The interface of the social learning service needs to be improved. For example, the students would like to have snippets
like Google, and more information about relevance within the course.
In addition to showing who bookmarked a learning material, it would be also relevant to know who created it.
In the validation context, the network of an LTfLL member was used, which influenced the results. It is relevant for
learners to indicate clearly the status of a person in the network and how it fits in their study-related social network.
Improve ranking of results.
General
The task determines which types of learning materials are relevant. For example, you don't want to get a video when
you are solving a task, since you do not have the time to watch it. It would be better to have a document in this case,
but a video might be more useful when you study.
Add snippets to the search results.
Offer system via general link instead of the course website to allow quicker access (e.g. http://mycourse.iFLSS.com)
Learner focus group 2
(prioritisation of
enhancements)
Learners judged that the five most important areas for enhancement of the system (i.e. clusters) are:
1. Interface (especially SL)
2. Link system better to course (i.e. better integration of system in course, instead of providing it as a general system) (both)
3. Accessibility of the system (both)
4. Ranking (both)
5. Determine language of documents (both)
Learners judged that the five most important single improvements that should be made to the system are:
1. Concepts in different colours on basis of topics covered in course (KD)
2. Add more relations in the ontology (KD)
3. Make system more easily accessible, not via WebCT (i.e. a general link like http://mycourse.iFLSS.com) (both)
4. Add snippets to search results (both)
Teaching manager
interview
Both services can improve the learning process and the New Media institute (note: we interviewed the TM from this
institute) would be willing to try them if they can be easily adapted to other domains
Knowledge discovery service ready to be used
The social learning service should be improved, especially with respect to the interface
Since both services offer different advantages and opportunities, it is not considered necessary to offer an integrated
system, although it would be nice to offer institutions the possibility to combine the two components into one system.
Section 5: Results – validation activities informing transferability, exploitation and barriers to adoption
VALIDATION
ACTIVITY
Partner(s) involved:
PUB-NCIT
Service language:
English
Additional formative results (not associated with validation topics)
Beta testing
Moved the services to an Amazon server because of performance problems. The transfer of the services took about 4
hours because there were a lot of additional software packages that had to be installed (python packages mostly). This
should be solved by generating an executable containing all the packages. It might be interesting to assemble all the
LTfLL services on an Amazon machine so that anyone can start a fully installed server in seconds.
Content has been added to the social network sites
Tutor interviews
Tutors didn't believe in the widgetized approach. One of the tutors stated that it could be more difficult for the student to
find information in a lot of different widgets than in one integrated application (window).
The tutorial should also contain some advice on how to use the social networking sites more efficiently, e.g. browser
extensions, tips and tricks on how to tag based on learner's searching habits.
Some tutors believe that their colleagues who teach subjects that are not related to web 2.0 wouldn't use social
networking sites for learning purposes.
Learner focus group 1
The system ran too slowly when everybody was using the system
The links are good only when the teaching assistants invest time, which probably won't happen in all the courses
The service is not so useful for problem-solving tasks
It is easier to obtain results from software with which the learners are familiar
Teaching manager
interview
The teaching manager argued that for the SL service, it would be difficult to convince the teaching assistants to work on
social networking sites and for the KD service the resources returned are not validated by a tutor.
The Teaching Manager was open to considering further integration with Moodle. However, to enable a deeper
integration of the iFLSS into Moodle, this should be thoroughly tested by the person responsible for the Moodle
platform.
The Teaching Manager expressed his concerns regarding the effort for tutors in setting up their networks for the Social
Learning service, and concluded that the Knowledge Discovery service was more likely to be adopted.
Other (please specify)
During verification, it was found that tutor networks sometimes are too small to return five people relevant to a topic.
VALIDATION
ACTIVITY
Partner(s) involved:
UU
Service language:
English
Additional formative results (not associated with validation topics)
Beta testing
The social network content and contacts had to be adapted to the course
Some missing concepts covered in the course had to be included in the ontology
Tutor interviews
Some tutors are negative about disclosing personal information to students, but they suggested as an alternative that it
would be acceptable if one could choose which resources and messages are (not) shared with students. It is not
problematic to disclose your network (i.e. your friends) to students. On the contrary, it might have advantages, since
students might come across interesting people. However, there might be differences in this respect depending on the
generation to which one belongs.
Learner focus group 1
Usefulness of social learning service strongly depends on teachers.
It is interesting to follow clever students, but less interesting to follow friends.
Only useful in courses in which one has to search for new material, while often the information is already provided by
the tutors (note: they were all 1st-year students).
Students in general are not willing to try new systems, since they are quite satisfied with Google's results.
The impact of using new software is considered a barrier for learners, who are all experienced Google users. Google
works fine and the learners are satisfied with it, so they do not see the need to use other software.
Another barrier is the fact that the software is integrated in WebCT, which involves additional steps to find the software,
whereas Google is immediately accessible at any time.
Teaching manager
interview
The teaching manager of the New Media institute said that they are eager to try new systems and would be ready to
adopt them as long as tutors can control the software themselves. However, teachers from other institutes believe that
this is not the case at the faculty level, since the system is too innovative and there is no funding that can be invested to
improve the performance.
Transferability questionnaire: Institutional policies and practices
Regular use of social platforms by students is necessary for the Social Learning component and should be allowed and recommended by the
institution. Alternatively, the social search service can be deployed with the option to allow students to search within the social network of their tutor without
needing social networking accounts themselves. This may be an effective compromise, since they still gain access to high quality learning material
found in the social network of their tutor. The negative side-effect, however, is that additional personalisation of the content is not possible. This is the setup
that was pioneered during the validation in order to reduce the requirements for adoption of the software by students.
The use of social networking tools should be encouraged for resource sharing between colleagues in the workplace. Tutors indicated that they would see
great value in social networking with colleagues provided that separate work-only-accounts are facilitated and critical mass is obtained.
Transferability questionnaire: Relevance of the service in other pedagogic settings
Pedagogic setting
Pedagogic settings for which the service would be suitable:
Social Search: directed learning, social learning student projects, work-based learning
Semantic Search: directed learning, self-directed learning, problem-based learning, essay writing
Reason(s)
This can be used in any environment that fulfils the following characteristics:
student wants to understand a new domain
there is some kind of supervision or direction from a tutor
the tutor doesn't offer full time tutoring and doesn't have time to recommend study materials
both tutor and learner use social platforms
In SDL, the service can be used to learn the most important concepts and relations within a domain on your own
In DL, the service can be used as an additional learning resource to identify how a concept is related to other concepts
In PBL, the service can assist students while collaboratively solving problems by providing the expert view on a domain.
For essay writing, the service can be used to get oriented and to find important or novel concepts and documents.
The service can be used for reflection and to prepare for exams by providing a clear overview of course subjects against which students can check their proficiency.
The tutor can employ the tool to give the students the proper context and interrelations of the course subjects as they are taught each week.
Pedagogic settings for which the
service would be less suitable:
Social Search - PBL
Social and Semantic Search - Revising for exams
Reason(s)
Problem-based learning is not likely to be helped by the social component, as it is not very likely for a
peer to run into the same very specific problem that you run into
Revising for exams: the tools help you find new materials, but do not offer summaries or revision
features
Less suited for students who are very insecure and get confused when confronted with non-linear
approaches to learning.
Increased time pressure, for example from a large number of mandatory assignments to be
completed during the course, will force students to stick to conventional means of acquiring information.
Just opening a new system and getting to know it is too much effort compared to conventional means
and not worth it, even if there is a long-term pay-off when using it. This problem even extends to
solutions properly embedded within the institute's LMS, because the login procedure itself is already an
additional barrier to adoption.
Transferability questionnaire: Relevance of the service in other domains
Types of domain
Types of domain for which the service would be suitable: learning
Reason(s)
The services can be used in any learning domain (e.g. mathematics, biology, linguistics), as long as
there is an ontology covering the knowledge of that specific domain and the learner has a network in
which resources on this domain are contained. So, any restrictions are not related to the service itself.
Types of domain for which the service would be less suitable: Practical assignments
Reason(s)
The learning process itself should however not have a strong emphasis on practical assignments
with little attention to the theoretical background behind the task when using knowledge discovery tool.
Section 6: Conclusions
Validation Topics
OVT
Operational Validation Topic
Validated
unconditionally
Validated
with
qualifications
Not
validated
Qualifications to validation
PVT1: Verification of accuracy of NLP tools
OVT1.1
The knowledge discovery system provides a
high proportion of relevant learning materials
that match the search topic (knowledge
discovery)
OVT1.1
The social learning search provides a high
proportion of relevant learning materials that
match the search topic
OVT1.2
The social network service suggests a high
proportion of people relevant to the search topic
OVT1.3
The average learner's social network has
enough people in it who can help him
UU
PUB-NCIT
Assumption: tutor has added
content to the social networking
sites. The search is also sensitive
to the use of different spelling
variants.
PUB-NCIT
Assumption: tutor is connected to
people that post content in the
domains searched by the students
PUB-NCIT
More likely if the learner is
connected to the tutor. Depends
on the tutor being connected to
people that post content in the
domains searched by the students
PVT2: Tutor efficiency
OVT2.1
Tutors have to spend less time finding relevant
learning materials and helping the learner to
identify related concepts
UU
PUB-NCIT
Tutors often know 'too much'
already. Especially coverage of
social resource search considered
too low, but this depends on the
efforts of the tutors themselves.
High set-up costs influence
opinion of tutors. Tutor wants
more influence on the ontology.
OVT2.2
There is less cognitive load for the tutor to help
the learners to find relevant learning materials
and to help the learners to identify related
concepts
UU
PUB-NCIT
The system itself is considered not
cognitively demanding. However,
it's hard to beat Google, which is
very familiar to everyone.
Again, the cognitive load for
identifying concepts is not high,
but the system does not
outperform Google. This may also
depend on the course concepts.
PVT3: Quality and consistency of (semi-)
automatic feedback OR information returned
by the system
OVT3.1
The learners judge the learning materials
provided by the system as being relevant for
their learning task
PUB-NCIT
UU
Results considered as
relevant as Google results, but
system does not outperform
Google. System considered less
useful for problem-solving tasks.
OVT3.2
The learners judge the people proposed by the
social network service as being relevant
PUB-NCIT
UU
Tested with tutors' networks in
validation, still to be tested with
students' own networks. Feedback
to user about relevance of people
should be improved.
OVT3.3
Learners trust the retrieved learning materials
more than those found by traditional means
UU
PUB-NCIT
Different results in UU and PUB-NCIT. The social learning system
did not expose the trust dimension
enough. A small group of early
adopters appreciated and
understood the idea. Probably
these are the ones that are more
familiar with the use of social
networks.
PVT4: Making the educational process
transparent
OVT4.1
Learners can independently identify gaps in their
knowledge in a given domain and learn how
concepts are related to each other
OVT4.2
The visual representation of the domain helps
learners to understand the domain better
compared to Google.
UU
PUB-NCIT
Information overload is a problem:
there's one big gap. Makes it
difficult for beginners, should be
tested with learners that have
more knowledge about the domain
already
UU
Visual representation helped
(OVT4.1 and 4.3), but the system
does not outperform Google when
looking at the means. However, at
both institutions – especially at
PUB-NCIT – there is still a
considerable group of positive
people saying that the system
does outperform Google.
The opinions of the learners also
seem to be related to some extent
to their learning styles. Testing
with groups of people having
different learning styles could
point out whether this indeed is
the case.
OVT4.3
The visual representation of the domain helps
learners to understand the domain better than
they would have without this visualization.
PUB-NCIT
UU
Students would have preferred a
longer usage of the service in
different domains to give them
more insights in the usefulness of
the visualisation.
After discussing the visualisation
in the focus groups (UU), students
could more clearly see the
benefits of the visualisation while
learning domains they're
interested in.
PUB-NCIT
The UU learners do not have
experience with self-directed
learning, should be tested with
other learners or informal learning
professionals. The PUB-NCIT
learners are weakly positive,
especially about the use of the
knowledge discovery service for
self-directed learning.
PVT5: Quality of educational output
PVT6: Motivation for learning
OVT6.1
The learners perceive that the iFLSS supports
more self-directed learning compared to
traditional means
PVT7: Organisational efficiency
OVT7.1
There is a saving in institutional resources
overall
The set-up costs may be relatively
high in the beginning, but in the
long term there will be a saving in
resources (UU). The PUB-NCIT
teaching manager agreed that the
software might bring gains in the
time the professors spend creating
teaching materials. However,
there is not sufficient evidence that
there will be a saving in
institutional resources. More
research is needed to draw
conclusions in this respect.
PVT8: Relevance
OVT8.1
The service meets one or more institutional
objectives
PUB-NCIT
UU
The system meets several
institutional objectives, the most
important one according to UU
and PUB-NCIT is that it assists
students to easily access diverse
learning materials in doing
research
PVT9: Likelihood of adoption
OVT9.1
Users were motivated to continue to use the
system after the end of the formal validation
activities (footnote 6)
PUB-NCIT
UU
6 System remains available until the end of the course
Logging results for students at UU
quite positive, while they were less
positive on the questionnaire.
Learners had problems
generalizing to other domains /
courses and only thought of the
setting they experienced in the
course. Tutors generally more
positive.
OVT9.2
A high score was obtained in the generic
questionnaires (based on UTAUT: likelihood of
adoption by users).
OVT9.3
Tutors attending a dissemination workshop give
high scores to the question 'how likely are you to
consider adopting the service in your own
educational practice?
OVT9.4
Learners find the information provided by the
system in addition to the learning materials (e.g.
titles, users, definitions) useful for the task being
undertaken.
OVT9.5
Learners perceive that they can find learning
materials more quickly compared to traditional
means.
PUB-NCIT
UU
Pending (tutor workshop: end of
February)
The many neutral opinions may have
several causes. One solution could be
to test this point using a scale with
an even number of response options, to
force users to choose.
PUB-NCIT
UU
PUB-NCIT
UU
Barriers: using system takes more
time, because it is new and
embedded in course environment.
Exploitation (SWOT Analysis)
The objective you are asked to consider is: "The iFLSS (v1.5) will be adopted in pedagogic contexts beyond the end of the project".
Strengths
The strengths of the iFLSS (v1.5) that would be positive indicators for adoption are:
Innovative functionalities: graph visualisation of a domain, the user search and the social resource search
System enhances the learning experience: it offers the learner trusted materials and provides an overview of a domain
Time needed to maintain courses after the set-up phase is low
Willingness to use the system: the iFLSS was accessed after the pilot by more than 50% of the Dutch students who
used it in the pilot (footnote 7).
7 We only have UU figures on this.
Usability
Weaknesses
The weaknesses of the iFLSS (v1.5) that would be negative indicators for adoption are:
Added value of the iFLSS compared to Google is not clear to all users
Set-up costs: effort required to set up the ontology for a new domain and to find good quality resources and contacts.
System feedback: assessing trust and quality is difficult for documents from 'friends of a friend'
System performance: part of the system is too slow at present to deal with large groups of learners at the same time.
Opportunities
The iFLSS (v1.5) has potential as follows:
Social media are actively employed by students. The iFLSS allows the learner to profit from the knowledge in their
network and adds an educational dimension to the use of social media.
The system supports several aspects of the learning process (reflection, knowledge discovery, identifying relevant
learning materials, finding people)
Additional support for good students: the iFLSS provides easy access to more quality-approved documents than the
standard course materials.
It allows tutors to search learning materials for their students in the networks of colleagues and fellow researchers
Learners have access to each other's bookmarks and can easily see which articles 'good' students use.
The chances of adoption of the iFLSS would improve if the system were part of the institutional LMS
Threats
The iFLSS (v1.5) has the following threats:
Google is the standard: students are not willing to use other systems with overlapping functionalities
Conservative attitude: tutors may not be ready to integrate their social network activities into their teaching, which is
necessary for the iFLSS to succeed
Control information: tutors and the teaching manager want to control the information provided to their learners, which is
contrary to the philosophy of social learning as adopted in the iFLSS.
Privacy issues: tutors' contacts may not be willing to publicise their assets
New developments: since the beginning of the project, Facebook has become much more dominant in social
networking, whereas the iFLSS does not currently interoperate with Facebook
Problem-solving support: a common learning context in which students search for non-course materials is the problem-solving context. The iFLSS does not pay attention to this particular learning situation.
Overall conclusion regarding the likelihood of adoption of the iFLSS Version 1.5:
The iFLSS supports the learning process by offering two innovative services that do not yet exist in current learning management systems. More
specifically, the system has made important contributions towards (1) integrating social search, social media content and social networks in a
learning environment and (2) offering learners a visual overview of a domain through which they have access to socially relevant documents. These
innovative aspects of the software are highly regarded among a small group of users, but are difficult to understand for
others, who were not able to see the added value of the iFLSS over existing software. This lack of awareness negatively influences the likelihood of
adoption at this moment. However, we believe that as time goes by the use of social networks in learning will become more common to users and,
as a consequence, the likelihood of adoption of the iFLSS will increase. In addition, users are normally conservative and it requires a long period of
time before they will switch from a known system to an unknown one.
Most important actions to promote adoption of the iFLSS:
Functional: include interoperability with Facebook
Functional: investigate possibilities for support in problem solving
System set-up: provide explanation on how to include new ontologies
System set-up: provide an executable file for transferring the service to new servers
Usability: work on improved feedback to make it easier to assess trust and quality of results from the social resource search
Usability: improve scalability for certain parts of the iFLSS
User group: (1) Investigate whether the service is better directed towards self-directed learners (researchers, tutors, mature adults) rather than young
undergraduates and (2) test the system with a group of early adopters
Support: description of the advantages that the iFLSS has for learning compared to existing systems on the project website
Section 7 – Road map
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important future
enhancements to the system in order to meet stakeholder requirements:
Most important:
1. Include interoperability with Facebook and investigate best use of Facebook with respect to learning. Facebook has evolved rapidly in recent
years and is still growing. There are 500 million active users, each of them having on average 130 friends. Making the system available through
Facebook would lower the threshold for learners.
2. Improve user interface, especially for Social Learning. The learners had difficulty assessing the relevance and difficulty of learning materials
on the basis of the current user interface. System feedback should be improved to tackle this issue.
3. Improve performance under heavy use, especially for social learning. The iFLSS is a prototype and some of its components had performance
problems when dealing with a large number of users at the same time. To make the system useful, this problem needs to be solved, since
learners take Google as the standard, and are not willing to wait for their results much longer than they have to wait when using Google.
4. Implement improved support for tutors. The ontology is now not tailored to a specific course or subdomain, but shows the complete domain,
without making a distinction between course concepts and non-course concepts. Even though the coverage of the ontology is high, there might
be concepts missing. Tutors would like to be able to adapt the ontology to their own needs and asked in the validation for a way to distinguish
between course and non-course concepts and wanted to know how to find and include their own ontology.
5. Develop a personalization mechanism to improve results in ranking. The ranking of the learning materials was often considered
inappropriate. One way to improve the ranking would be to take the profile of the learner into account, allowing learners to search for documents at their own
level.
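The personalised ranking proposed in item 5 could be prototyped along the following lines. This is only a minimal sketch under stated assumptions, not the iFLSS implementation: the `Document` fields, the 1 to 5 difficulty scale, the `penalty` weight and the example documents are all hypothetical and do not come from the validation.

```python
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    relevance: float  # assumed score from the existing search, between 0 and 1
    difficulty: int   # hypothetical difficulty level, 1 (introductory) to 5 (expert)


def personalised_rank(docs, learner_level, penalty=0.2):
    """Re-rank documents so that those close to the learner's level come first.

    Each level of distance between the document difficulty and the learner
    level reduces the original relevance score by `penalty`.
    """
    def score(doc):
        return doc.relevance - penalty * abs(doc.difficulty - learner_level)

    return sorted(docs, key=score, reverse=True)


# Hypothetical example documents, for illustration only.
docs = [
    Document("Advanced survey", 0.9, 5),
    Document("Introductory tutorial", 0.8, 1),
    Document("Intermediate overview", 0.7, 3),
]

for doc in personalised_rank(docs, learner_level=1):
    print(doc.title)
```

With a beginner profile (level 1) the introductory material moves to the top even though its raw relevance score is lower, which is the behaviour the learners asked for. How the learner level and the document difficulty would actually be estimated is left open here.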
Other:
Link system better to course (i.e. better integration of system in course, instead of providing it as a general system)
Enable learners to interact with the system (e.g. adding/adapting content, offering feedback)
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important changes
to the current scenario(s) of use in order to meet stakeholder requirements:
Most important:
1. The system is currently less appropriate for courses where self-directed learning is not required, e.g. courses where tutors want to control the
material presented to the learner.
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are possible additional
educational contexts for future deployment:
Most important:
1. Courses where development of independent / self-directed learning is an intended learning outcome. The iFLSS is useful in this setting, since
it offers learners access to information and documents that go beyond the standard course materials.
2. Situations in which tutors wish to find learning materials for their courses from trusted sources. The tutors can easily access learning materials
from fellow tutors and researchers who teach comparable courses at other institutions.
3. Courses where learners are encouraged to find materials from people more expert than themselves. E.g. learners could follow 'good'
students to see which learning materials these students use.
4. The ontology fragment provides learners a concise overview of a domain. It is considered useful for reflection purposes: do I know what I
should know? And which topics should I look at?
5. Exploit the power of successful social media sites such as Facebook to promote use.
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important issues for
future technical research to enable deployment of language technologies in educational contexts:
Most important:
1. Language Technologies need to take the learner context (i.e. his level of conceptual development on a topic) into account in order to
determine the appropriateness of new resources. Research in this direction should look at the possibility of developing (1) a personalised difficulty
estimator for new resources triggered by specific concepts and/or tags and (2) an improved method for ranking results on the basis of the
learners' knowledge
2. Integration of conventional, ontology-based and social searches.
Roadmap - validation activities
Further validation planned for beyond the end of the project: Testing in an informal learning environment
Objective (OVT): Assessing whether knowledge discovery supports independent knowledge acquisition in informal learning contexts
Methodology: Validating the software in an informal learning context
Further validation planned for beyond the end of the project: Using students' own networks
Objective (OVT): Results are considered more trustworthy when the users know the people who have bookmarked / uploaded them.
Methodology: Identify group of people that are actively using social networks and validate the system with them
Further validation planned for beyond the end of the project: Testing with groups of people having different learning styles
Claim (OVT): Learners with visual learning styles appreciate the visualisation of the knowledge discovery service more than non-visually oriented
learners
Methodology: Use an existing questionnaire to determine the learning style of learners, such as
http://www.engr.ncsu.edu/learningstyles/ilsweb.html, and investigate whether there is a relation between the learning style and the learners'
opinions about the software
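A first look at the relation between learning style and opinion could be as simple as comparing mean ratings per style group. The sketch below assumes that each learner contributes one learning-style label and one numeric rating of the visualisation. The labels, the rating scale and the values shown are placeholders for illustration only and are not data from the validation.

```python
from statistics import mean


def mean_rating_by_style(responses):
    """Group (learning_style, rating) pairs by style and return the mean rating per style."""
    groups = {}
    for style, rating in responses:
        groups.setdefault(style, []).append(rating)
    return {style: mean(ratings) for style, ratings in groups.items()}


# Placeholder responses (style label, rating on an assumed 1-5 scale); not real results.
responses = [("visual", 4), ("visual", 5), ("verbal", 3), ("verbal", 2)]

for style, avg in mean_rating_by_style(responses).items():
    print(style, avg)
```

A difference in group means would only be a starting point. With real questionnaire data, a significance test suited to the sample size would be needed before drawing conclusions.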
Appendix B.8 Validation Reporting Template for Long Thread (OUNL, AURUS & PUB-NCIT)
Section 1: Functionality implemented in Version 1.5 and alpha / beta-testing record
Brief description of functionality
Unit | Version number of unit | Changes from Version 1.0
LongThread | v1.0 | Based on the existing LTfLL services the data transfer has been implemented.
PenSum | v1.5 | Changed to an English version
Conspect | v1.5 | Changed the domain to IT
Alpha-testing
Pilot site and language: OU, English
Date of completion of alpha testing: 2 February 2011
Who performed the alpha testing? Katja Bülow, Debra Harris
Beta-testing
Pilot site and language: PUB-NCIT, English
Has stakeholder validation taken place using the service embedded in Elgg (Yes/No/Partially): yes
If ‘No’ or ‘Partially’, give reasons: The services have been embedded in Elgg
beta-testing performed by: Traian Rebedea
beta testing environment (stand-alone service / integrated into Elgg): integrated into Elgg
HANDOVER DATE: 3 February 2011
Pilot site and language: OUNL, English
Has stakeholder validation taken place using the service embedded in Elgg (Yes/No/Partially): yes
If ‘No’ or ‘Partially’, give reasons: The services have been embedded in Elgg
beta-testing performed by: Slavi Stoyanov, Adriana Berlanga
beta testing environment (stand-alone service / integrated into Elgg): integrated into Elgg
HANDOVER DATE: 3 February 2011
Section 2: Validation Pilot Overview
NB Information about pilot sites, courses and participants has been transferred to Appendix A.3
Pilot task – learner pilot
Pilot site: PUB-NCIT
Pilot language: English, Romanian
What is the pilot task for learners and how do they interact with the system?
The long thread combines 4 different LTfLL services. Pilots were run as workshops because it was not practical to run full pilots, for the
following reasons: (1) a real task covering the LT workflow would take a long time (more than the learners could provide); (2) it would
require training in advance on the individual services and their combinations; and (3) it would require substantial input from the learners at each
step. The learners were asked to follow exactly the designed process flow and to use predefined data, both to investigate the threading mechanism
and to use the different services in more depth. The task was specified in the following scenario: “…learners search for and receive relevant
learning materials through the iFLSS service. Then learners can study them and write a synthesis, to indicate the extent to which they have
understood the content (PenSum). The synthesis is an input for CONSPECT to detect automatically the relationships between concepts in a
concept map format. The workflow of the long thread includes additional support from iFLSS, which provides resources collected from social
media. Finally, the teacher can determine topics for a group chat in PolyCAFe, as the discussion can be analysed to identify important issues and
the level of participation of the learners.” The main objective for this validation was: Measuring pedagogic effectiveness and efficiency of the long
thread.
What do the learners produce as outputs?
Their opinions about the Threading approach, a quality check (formative validation) on the Long Thread by using our LT questionnaire and by
generating ideas to change the software as well as to propose additional threads. The focus was mainly on formative (qualitative) data, but
descriptive quantitative data from the questionnaire was also collected.
How long does the pilot task last, from the tutors starting the task to their final involvement with the software?
Only one part of a day to get introduced, to do the hands-on exercises and to provide feedback.
How do tutors/student facilitators interact with the learners and the system?
The learners followed the different steps in the thread to give them a concrete feeling for the Long Thread. Specific groups of 4 learners
were instructed to go into more detail on the service assigned to the group. There they used the service (iFLSS to explore the resources,
Pensum to write/analyse a synthesis, Conspect to analyse the concepts covered in the synthesis and PolyCAFe to discuss the different
services) to do their subtasks.
Describe any manual intervention of the LTfLL team in the pilot:
The data input (including output from a preceding service in the workflow) had been predefined to enable the groups to explore the steps
located later in the flow.
Pilot task – teachers/TEL experts
Pilot site: OUNL
Pilot language: English
What is the pilot task for teachers/experts and how do they interact with the system?
Two events were conducted at the OUNL. Event 1: A focus group with tutors/teachers was organised to discuss the concept of threading and
the long thread. They did not interact with the LTfLL services or the long thread; the information required for a proof of concept was provided by
the LTfLL team in a presentation. This presentation gave a very concise description of the functionality of the LTfLL services and the concept of
threading, explained the Long Thread as a possible educational use of such threading, and finally set out the aims of the focus
group and the method to be used. After the introduction, the participants were asked first to give comments and generate ideas about possible benefits,
weaknesses and obstacles for the adoption of long threads and secondly to fill in a questionnaire. The main objectives for this validation event were: a)
Identifying benefits, weaknesses and obstacles for adoption of the long thread concept and b) Measuring pedagogic effectiveness and efficiency of
the long thread. Event 2: A walkthrough and focus group with technology-enhanced learning (TEL) experts was organised so that they could get
acquainted with the Long Thread. They used the thread description and instructions on the LTfLL server to gain hands-on experience, working with
preloaded data. The task was specified in the same scenario as for the learners at PUB-NCIT: “…learners search for and receive relevant
learning materials through the iFLSS service. Then learners can study them and write a synthesis, to indicate the extent to which they have
understood the content (PenSum). The synthesis is an input for CONSPECT to detect automatically the relationships between concepts in a
concept map format. The workflow of the long thread includes additional support from iFLSS, which provides resources collected from social
media. Finally, the teacher can determine topics for a group chat in PolyCAFe, as the discussion can be analysed to identify important issues and
the level of participation of the learners.” The main objectives for this validation event were: b) Measuring pedagogic effectiveness and efficiency of
the long thread and c) Investigating possible improvements to the current version of the long thread and informing the LTfLL roadmap.
What do the learners produce as outputs?
Their opinions about the Threading approach, a quality check (formative validation) on the Long Thread by using our LT questionnaire and by
generating ideas to change the software as well as to propose additional threads. A further clustering of the comments from event 1 was
conducted by the Long Thread Validation Team. The focus was mainly on formative (qualitative) data, but a quantitative analysis was also
conducted (descriptive statistics and cluster analysis).
How long does the pilot task last, from the tutors starting the task to their final involvement with the software?
2 ½ hours to get introduced, to do the hands-on exercises and to provide feedback.
How do tutors/TEL experts interact with the learners and the system?
For event 1: the tutors did not interact with the Long Thread and the services. For event 2: the TEL-experts followed the flow of the Long Thread.
Describe any manual intervention of the LTfLL team in the pilot:
The data input (including output from a preceding service) had been preloaded to enable the demonstration (event 1) and walkthrough (event 2).
Section 3: Results - validation/verification of Validation Topics
OVT: 1.1
Pilot site: PUB-NCIT
Pilot language: English
Operational Validation Topic: The integration with a step-wise access and data transfer functions in the Long Thread.
Summative results with respect to validation indicator:
Experimental results, with stakeholders involved and brief methodology
Stakeholders / methodology:
Results: In general, the data transfer between the different services works, as was shown in the two hands-on validation events.
Formative results with respect to validation indicator
Stakeholder type | Results
TEL-experts | Two challenges remain to be addressed in the future: the automatic processing of the search results from iFLSS by PenSum requires that the provided URL directs to a textual document. Whenever the search result refers to more complex sites (including navigation, menus) PenSum has problems finding the main textual body.
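The step-wise data transfer verified here can be pictured as a simple chain in which each service's output becomes the next service's input. The following Python sketch is only an illustration under that assumption: the class ThreadState, the function run_thread and the four stub steps are hypothetical names invented for this example, and they do not call the real iFLSS, PenSum, Conspect or PolyCAFe services.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ThreadState:
    """Data handed from one step to the next (all field names are illustrative)."""
    query: str
    resources: List[str] = field(default_factory=list)
    synthesis: str = ""
    concepts: List[str] = field(default_factory=list)
    chat_topics: List[str] = field(default_factory=list)
    log: List[str] = field(default_factory=list)


Step = Callable[[ThreadState], ThreadState]


def search_resources(state: ThreadState) -> ThreadState:
    # Placeholder for the resource search step (iFLSS in the scenario).
    state.resources = [f"Introductory text about {state.query}"]
    return state


def write_synthesis(state: ThreadState) -> ThreadState:
    # Placeholder for the learner's synthesis (PenSum in the scenario).
    state.synthesis = " ".join(state.resources)
    return state


def extract_concepts(state: ThreadState) -> ThreadState:
    # Placeholder for concept detection (Conspect in the scenario):
    # here simply every word longer than five characters.
    words = {w.strip(".,").lower() for w in state.synthesis.split()}
    state.concepts = sorted(w for w in words if len(w) > 5)
    return state


def choose_chat_topics(state: ThreadState) -> ThreadState:
    # Placeholder for the teacher picking chat topics (PolyCAFe in the scenario).
    state.chat_topics = state.concepts[:2]
    return state


STEPS: List[Tuple[str, Step]] = [
    ("iFLSS", search_resources),
    ("PenSum", write_synthesis),
    ("Conspect", extract_concepts),
    ("PolyCAFe", choose_chat_topics),
]


def run_thread(state: ThreadState, steps: List[Tuple[str, Step]]) -> ThreadState:
    """Run the steps in order; each step receives the state left by the previous one."""
    for name, step in steps:
        state = step(state)
        state.log.append(name)
    return state


if __name__ == "__main__":
    final = run_thread(ThreadState(query="databases"), STEPS)
    print(final.log, final.chat_topics)
```

Keeping the hand-over data in one explicit object makes it easy to preload the output of an earlier step, which is how the pilots allowed groups to start at a later point in the flow.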
OVT: 2.1
Pilot site: PUB-NCIT
Pilot language: English
Operational Validation Topic: The teacher saves time and resources by using the long thread.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n
Learners | The combination of language technology services saves time | Experimental | 3.40 | 0.96 | | 25
OVT: 2.1
Pilot site: OUNL
Pilot language: English
Operational Validation Topic: The teacher saves time and resources by using the long thread.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n
Teachers | The combination of language technology services saves time | Experimental | 2.13 | 0.99 | | 9
TEL-experts | The combination of language technology services saves time | Experimental | 2.38 | 0.52 | | 8
OVT: 2.2
Pilot site: PUB-NCIT
Pilot language: English
Operational Validation Topic: There is less cognitive load required to use the Long Thread.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n
Learners | Using the combined language technology services seems to require little effort. | Experimental | 3.04 | 1.24 | | 25
OVT: 2.2
Pilot site: OUNL
Pilot language: English
Operational Validation Topic: There is less cognitive load required to use the Long Thread.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n
Teachers | Using the combined language technology services seems to require little effort. | Experimental | 2.00 | 0.76 | | 9
TEL-experts | Using the combined language technology services seems to require little effort. | Experimental | 2.00 | 0.76 | | 8
Formative results with respect to validation indicator
Stakeholder type | Results
TEL-experts | I do not believe in making things easier by providing scenario.
OVT: 3.1
Pilot site: PUB-NCIT
Pilot language: English
Operational Validation Topic: The combination of language technology services would be useful for my learning.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n
Learners | The combination of language technology services would be useful for my learning. | Experimental | 3.96 | 0.79 | | 25
OVT: 3.1
Pilot site: OUNL
Pilot language: English
Operational Validation Topic: The combination of language technology services would be useful for my teaching.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n
Teachers | The combination of language technology services would be useful for my teaching. | Experimental | 2.00 | 1.07 | | 9
TEL-experts | The combination of language technology services would be useful for my teaching. | Experimental | 4.13 | 0.64 | | 8
Formative results with respect to validation indicator
Stakeholder type | Results
TEL-experts | I can see the value of the individual services quite well, but think that the added value of combining individual services will emerge after having used them and having become thoroughly familiar with them.
TEL-experts | I’d love to test these applications in my practice.
Teachers | Is there any proof that language technology can help in teaching?
OVT: 6.1
Pilot site: PUB-NCIT
Pilot language: English
Operational Validation Topic: The use of threading and the Long Thread are encouraging the motivation for learning.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n
Learners | Using combined language technology makes learning more interesting | Experimental | 3.96 | 0.94 | | 25
OVT: 6.1
Pilot site: OUNL
Pilot language: English
Operational Validation Topic: The use of threading and the Long Thread are encouraging the motivation for learning.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n
Teachers | Using combined language technology makes learning more interesting | Experimental | 3.63 | 0.52 | | 9
TEL-experts | Using combined language technology makes learning more interesting | Experimental | 2.25 | 0.89 | | 8
Formative results with respect to validation indicator
Stakeholder type | Results
Teachers | As a teacher I need to know my students and have direct contact; if a lot is in between, I get lost.
OVT: 8.1
Pilot site: PUB-NCIT
Pilot language: English
Operational Validation Topic: The flexible combination of services in a thread has a potential to solve specific educational problems.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n
Learners | Combinations of different language technologies services, when compared with individual services, would enable new solutions for specific educational problems. | Experimental | 3.92 | 0.76 | | 25
Learners | The language technology services could be combined in many different ways. | Experimental | 3.84 | 1.07 | | 25
Learners | The combination of language technology services could be used across different subject matter domains. | Experimental | 3.76 | 1.05 | | 25
Formative results with respect to validation indicator
Stakeholder type | Results
Learners | Documenting for an extensive project that needs to be solved in a team (iFLSS, Conspect, PenSum, PolyCAFe)
Learners | Teaming up students by using their compatibilities (abilities) (iFLSS, PolyCAFe)
Learners | Improving iteratively your knowledge (PenSum, Conspect, iFLSS)
Learners | Suggesting resources that were missed from a conversation (iFLSS, Conspect)
Learners | Documenting the start-up of a project (chat, PolyCAFe, Conspect, iFLSS)
Learners | Documenting for a bachelor thesis (Conspect, iFLSS, PolyCAFe)
OVT: 8.1
Pilot site: OUNL
Pilot language: English
Operational Validation Topic: The flexible combination of services in a thread has a potential to solve specific educational problems.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n
Teachers | Combinations of different language technologies services, when compared with individual services, would enable new solutions for specific educational problems. | Experimental | 3.50 | 0.76 | | 9
TEL-experts | Combinations of different language technologies services, when compared with individual services, would enable new solutions for specific educational problems. | Experimental | 3.87 | 0.35 | | 8
Teachers | The language technology services could be combined in many different ways. | Experimental | 3.75 | 0.71 | | 9
TEL-experts | The language technology services could be combined in many different ways. | Experimental | 3.88 | 0.64 | | 8
Teachers | The combination of language technology services could be used across different subject matter domains. | Experimental | 2.25 | 0.71 | | 9
TEL-experts | The combination of language technology services could be used across different subject matter domains. | Experimental | 3.25 | 1.49 | | 8
Formative results with respect to validation indicator
Stakeholder type | Results
Teachers | These combinations of tools seem strongest for high school use or specific higher education fields.
Teachers | Conspect + Pensum + iFLss = good combination.
OVT: 8.2
Pilot site: PUB-NCIT
Pilot language: English
Operational Validation Topic: The combination of language technology services would be useful for my teaching.
Formative results with respect to validation indicator
Stakeholder type | Results
Learners | Useful to all the widgets interacting together if you have a complex task
Tutor | The learners were able to list 6 educational scenarios for threads from their own needs
OVT: 8.2
Pilot site: OUNL
Pilot language: English
Operational Validation Topic: The combination of language technology services would be useful for my teaching.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n
Teachers | From educational point of view, I see potential in combining different individual language technology services in one application. | Experimental | 4.13 | 0.64 | | 9
TEL-experts | From educational point of view, I see potential in combining different individual language technology services in one application. | Experimental | 3.88 | 0.64 | | 8
OVT: 9.1
Pilot site: PUB-NCIT
Pilot language: English
Operational Validation Topic: Users were motivated to continue to use the system after the end of the formal validation activities
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n
Learners | I would be interested in using the combined language technology services after this pilot. | Experimental | 3.44 | 0.87 | | 25
OVT: 9.1
Pilot site: OUNL
Pilot language: English
Operational Validation Topic: Users were motivated to continue to use the system after the end of the formal validation activities
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n
Teachers | I would be interested in using the combined language technology services after this pilot. | Experimental | 2.63 | 1.41 | | 9
TEL-experts | I would be interested in using the combined language technology services after this pilot. | Experimental | 4.25 | 0.71 | | 8
Formative results with respect to validation indicator
Stakeholder type | Results
TEL-experts | In general the concepts are nice but there are many usability and technical issues to solve.
OVT: 9.2
Pilot site: PUB-NCIT
Pilot language: English
Operational Validation Topic: The combined language technology services could work well alongside other software.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n
Learners | The combined language technology services could work well alongside other software I usually use. | Experimental | 3.28 | 0.89 | | 25
Learners | The combined language technology services could work well alongside individual services. | Experimental | 3.64 | 0.86 | | 25
Learners | I feel comfortable in using language technology services in combination. | Experimental | 3.48 | 1.05 | | 25
OVT: 9.2
Pilot site: OUNL
Pilot language: English
Operational Validation Topic: The combined language technology services could work well alongside other software.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n
Teachers | The combined language technology services could work well alongside other software I usually use. | Experimental | 3.13 | 1.25 | | 9
TEL-experts | The combined language technology services could work well alongside other software I usually use. | Experimental | 3.87 | 0.35 | | 8
Teachers | The combined language technology services could work well alongside individual services. | Experimental | 3.38 | 0.74 | | 9
TEL-experts | The combined language technology services could work well alongside individual services. | Experimental | 3.75 | 0.89 | | 8
Teachers | I feel comfortable in using language technology services in combination. | Experimental | 3.96 | 0.94 | | 9
TEL-experts | I feel comfortable in using language technology services in combination. | Experimental | 2.63 | 0.52 | | 8
OVT: 9.3
Pilot site: PUB-NCIT
Pilot language: English
Operational Validation Topic: The use of threading has the potential to enlarge the possible user groups.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n
Learners | Combined language technology services when compared with individual services have added value for education. | Experimental | 3.64 | 0.86 | | 25
OVT: 9.3
Pilot site: OUNL
Pilot language: English
Operational Validation Topic: The use of threading has the potential to enlarge the possible user groups.
Questionnaire type | Questionnaire no. & statement | Experimental / control group | Mean | Standard deviation | %Agree / Strongly agree | n
Teachers | Combined language technology services when compared with individual services have added value for education. | Experimental | 3.38 | 0.74 | | 9
TEL-experts | Combined language technology services when compared with individual services have added value for education. | Experimental | 3.38 | 0.74 | | 8
Section 4: Results – validation activities informing the user requirements for the Threading approach
Case I (Focus group with tutors)
Measurement Instruments
Card sorting and a questionnaire were the two measurement instruments used for data collection. Card sorting requires the participants to generate, in the
format of statements, ideas about the benefits, weaknesses and obstacles for adoption of long threads. The list of statements was then uploaded into a
web-based environment and the members of the LT team individually sorted the statements across the three initial groups (benefits, weaknesses and
obstacles) to identify some additional patterns.
Analysis and results
The data collected is qualitative by nature but apart from content analysis, we also performed some quantitative analysis on it. The participants generated 55
statements. Of them, 14 were Benefits, 22 Weaknesses, 13 Obstacles and 6 Interesting/Suggestions.
The participants like the idea that large amounts of text can be handled, that feedback is objective, and that long threads would have a positive effect on
learners in developing effective learning strategies.
The tutors have some concerns regarding resistance of stakeholders, too much reliance on technology, applicability to different educational contexts,
quality of outcomes, and that the combination may not add value to individual services.
Some possible obstacles, as indicated by the participants, are combined workload, completeness and correctness of results, and acceptance of
“automated” support by stakeholders.
During this first level of analysis, we noticed some recurring issues across the four basic headings, e.g. feedback, workload, stakeholders’ resistance,
relevance to other domains and developing learning strategies. To reveal possible hidden structures in the data, the LT team applied card sorting. The list of
all statements was uploaded into websort, a web-based tool supporting card sorting. The LT team used closed categories, which yielded the
websort results (up to 25% low agreement; 50-75% medium agreement; more than 75% high agreement). Using the three graphs visualising the results
of the cluster analysis, together with a cluster analysis algorithm that takes the item-by-item percentage matrix as input in order to aggregate the
contributions of all sorters objectively, we ended up with eight groups of statements. They are as follows: Learning strategies (5 items), Adoption of stakeholders (5),
Method (11), Implementation (5), Quality of outcomes (6), Other domains uptake (4), Feedback (9), and Workload (10).
1. Learning strategies are about the effect of long threads on students’ learning. Some of the statements included in this group are: ‘Students learn to
critically look for feedback’; ‘Students will learn to become more independent’, and ‘Feedback loop for students’.
2. Adoption of stakeholders includes statements about barriers for adopting long threads. Examples are: ‘Willingness of the teachers to use it?
Resistance’, ‘Willingness of schools/students to use the tools’, ‘Convince students and teachers of value of programs that provide feedback’.
3. Method consists of some generic statements that refer to threads as a pedagogic approach. Some representative statements are: ‘Complete,
supplementary method’, ‘Combining (doubtful) services might be detrimental to good aspects’, ‘Different angles of the approach for educational
problems’, ‘Is there any proof that language technology can help in teaching?’, ‘The whole might be more than the sum of the parts’.
4. Implementation contains items such as ‘Implementation [is] difficult’, ‘Isn’t a quite great risk to buy this?’, ‘A cost benefit analysis is missing’.
5. Quality of outcomes indicates concerns about the capacity of language technologies to provide the same quality of outputs as human experts. Examples
of statements are: ‘Quality is not the same as quantity’, ‘System analyses, so no tutor-dependent’, ‘Can the programs detect misconceptions and lack
of understanding/quality?’.
6. Other domains uptake reflects concerns of the participants about the applicability of long threads to different contexts, domains and educational levels.
Some statements included in this group are as follows: ‘These combinations of tools seem strongest for high school use or specific higher education
fields’, ‘Not applicable in all types of education’, ‘Are the tools applicable to any contexts?’
7. Feedback, as the name suggests, is about the potential of long threads to give objective and reliable feedback to students and the ability of
stakeholders to read it in the right way. Some examples are: ‘Feedback is not dependent on an individual tutor’, ‘Feedback is objective’, ‘The services
lead to unwanted/unproductive bias in all feedbacks’, ‘Students and teachers are unable to use it. Do not have skills’, ‘Not always reliable feedback’,
‘Not all students are able to critically judge feedback’, ‘Risk for students to adapt to wrong feedback’.
8. Workload is about the efficiency of the tool in terms of time and effort spent. Some of the statements included in the cluster are as follows: ‘Large
amounts of texts can be handled’, ‘Reduction of workload for teachers’, ‘Overall tutor load should go down’, ‘Tools might lead to higher cognitive load
exceeding the gain’, ‘Increased work load for students’, ‘Tutor still needs to put much time in it’.
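The grouping step described above (an item-by-item percentage matrix aggregated over all sorters, followed by cluster analysis) can be sketched in a few lines of Python. This is a minimal illustration and not the websort implementation: the function names, the average-linkage merging rule, the stopping threshold and the four made-up statements are all assumptions introduced for the example.

```python
from itertools import combinations


def agreement_matrix(sorts):
    """Build the item-by-item percentage matrix.

    sorts: one dict per sorter, mapping statement id -> chosen category.
    Returns {(a, b): percentage of sorters who put a and b in the same category}.
    """
    items = sorted(sorts[0])
    matrix = {}
    for a, b in combinations(items, 2):
        same = sum(1 for s in sorts if s[a] == s[b])
        matrix[(a, b)] = 100.0 * same / len(sorts)
    return matrix


def pair_agreement(matrix, a, b):
    """Look up a pair regardless of the order of its two items."""
    return matrix[(a, b)] if (a, b) in matrix else matrix[(b, a)]


def cluster(items, matrix, threshold):
    """Average-linkage agglomerative clustering (an assumed method).

    Repeatedly merges the two clusters with the highest mean agreement and
    stops once the best available merge falls below `threshold` percent.
    """
    clusters = [[i] for i in items]
    while len(clusters) > 1:
        best, best_pair = -1.0, None
        for i, j in combinations(range(len(clusters)), 2):
            scores = [pair_agreement(matrix, a, b)
                      for a in clusters[i] for b in clusters[j]]
            mean_score = sum(scores) / len(scores)
            if mean_score > best:
                best, best_pair = mean_score, (i, j)
        if best < threshold:
            break
        i, j = best_pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    # Four hypothetical sorters placing four hypothetical statements.
    sorts = [
        {"s1": "benefit", "s2": "benefit", "s3": "weakness", "s4": "obstacle"},
        {"s1": "benefit", "s2": "benefit", "s3": "weakness", "s4": "weakness"},
        {"s1": "benefit", "s2": "benefit", "s3": "obstacle", "s4": "obstacle"},
        {"s1": "benefit", "s2": "weakness", "s3": "weakness", "s4": "obstacle"},
    ]
    m = agreement_matrix(sorts)
    print(cluster(sorted(sorts[0]), m, threshold=50.0))
```

The threshold plays the role of the agreement bands reported by the tool: a higher value yields more, smaller groups, while a lower value merges statements that only some sorters placed together.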
A further interpretation of results suggests some possible combinations between the eight categories, namely: Method (ideas related to thread as a pedagogic
approach), Implementation and Adoption (a combination between the clusters Implementation, Stakeholders adoption and Other domains uptakes),
Feedback and Quality of outcomes (Learning Strategies, Feedback and Quality of outcomes), and Workload.
Provisional results on these categories are:
Method: On the one hand, the participants principally assume that a long thread as a whole might be more than the sum of its parts. A long thread is
a method complementary to the individual services that provides different perspectives on educational problems. On the other hand, the tutors
suspect that combinations of services could be detrimental for individual services.
Feedback and Quality of outcomes: On the one hand, the locus of control is on the system, not on the individual tutor, which makes the feedback
objective. Feedback provided by the tool supports students in developing self-regulated learning. At the same time, concerns are raised as to whether
such feedback is reliable, whether it can match the quality of human experts, and whether stakeholders have the skills needed to deal with the process and
outcomes of such “automated” feedback.
Workload. In general, the participants believe that long threads should reduce the workload of both teachers and students, but the current state of the
application does not provide enough evidence that long threads can save time and effort. On the contrary, the participants see many problems in this
respect.
Implementation and adoption. This composed group contains statements that expressed mainly concerns regarding resistance of stakeholders to
adopt long threads, applicability to other context and domains, but also some suggestions for how to increase the likelihood of adoption.
Overall, the results indicate that the participants like some aspects of the idea of long threads (benefits), dislike others (weaknesses), and
have some concerns regarding implementation and adoption (obstacles). More importantly, it turned out that for one and the same issue (e.g.
method, feedback, workload), the tutors had both positive and negative reactions. This indicates the complexity of their thoughts, which reflects the
complexity of the problems they address. It draws a more realistic picture of the participants’ perception of the idea of long threads at a particular
moment. The results from the long thread questionnaire (LTQ) point in the same direction.
As can be seen, the highest scores go to items such as ‘From educational point of view, I see potential in combining different individual “language technology
services” in one application’ (M = 4.13; SD = 0.6); ‘The “language technology services” could be combined in many different ways’ (M = 3.75; SD = 0.7);
‘Using combined “language technology services” would make learning more interesting’ (M = 3.63; SD = 0.5); ‘Combinations of different “language
technologies services”, when compared with individual services, would enable new solutions for specific educational problems’ (M = 3.5; SD = 0.8). The
following items receive low scores: ‘The combination of “language technology services” would be useful for my teaching’ (M = 2; SD = 1); ‘Using the combined
“language technology services” seems to require little effort’ (M = 2; SD = 0.8); ‘The combination of “language technology services” seems to save time’ (M =
2.13; SD = 1); ‘The combination of “language technology services” could be used across different subject matter domains’ (M = 2.25; SD = 0.7); ‘I feel
comfortable in using “language technology services” in combination’ (M = 2.25; SD = 0.9). In general, the tutors rate highly the potential of long threads to
provide new solutions for educational problems in the long term. However, the tutors do not see at the moment how long threads can help their teaching, either in
terms of usefulness or in terms of efficiency (saving time and effort).
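The descriptive figures quoted for the questionnaire (mean, standard deviation, number of respondents and the share agreeing or strongly agreeing) can be computed per item as in the Python sketch below. It rests on assumptions that the report does not state: a 5-point scale on which 4 and 5 count as "Agree / Strongly agree", and the sample (n - 1) standard deviation. The function name summarise and the response list are illustrative only and are not pilot data.

```python
from statistics import mean, stdev


def summarise(responses, agree_from=4):
    """Summarise one questionnaire item.

    responses: list of integer ratings (assumed 1-5).
    agree_from: lowest rating counted as agreement (assumed to be 4).
    """
    n = len(responses)
    agree = sum(1 for r in responses if r >= agree_from)
    return {
        "n": n,
        "mean": round(mean(responses), 2),
        "sd": round(stdev(responses), 2),  # sample standard deviation
        "pct_agree": round(100.0 * agree / n, 1),
    }


if __name__ == "__main__":
    # Made-up ratings from five hypothetical respondents.
    print(summarise([2, 3, 4, 4, 5]))
```

With groups as small as those in the pilots, a single respondent shifts the mean noticeably, so the share of agreeing respondents is a useful companion to the mean.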
Based on the formative feedback from the learners on the individual services in the Long Thread, the main results are that:
- they appreciate the flexibility of the presented analyses (dynamic conceptogram, zooming possibilities, different ways of analyzing conversations).
- they like the highlighting functions for important feedback/results (concepts covered in course, concepts related keywords, the conversation thread highlight feature).
- they recognised additional potential for the individual tools that is not yet covered by their user scenarios (e.g. Conspect to identify concepts from a course and search for them on Wikipedia or Google, Pensum to summarize a discourse or presentation, PolyCAFe to improve their abilities for debate, dialogue and collaboration, iFLSS as a dictionary/thesaurus for new concepts).
- they consider the user interface to be of variable quality and not consistent across the services. While Pensum was praised for the potential of its user interface, it was also criticised for not being user friendly. The other services were also criticised (the conceptograms are too compact, difficult to rearrange, indicators do not have very good/useful labels, difficult to read due to font size and underlining).
- they see quality issues in the feedback (inaccessible concepts due to stemming, not neglecting small irrelevant phrases in original texts, starting a new unrelated paragraph always leads to a coherence error, several unimportant words are part of the conversation feedback, the resources from the social network contain a lot of junk besides the relevant ones).
- they discovered stability issues in some of the services (when combining two resources there were several errors, could not load new RSS feeds, problems using the conversation thread tabs).
- they found the on-line guidance insufficient for some of the services (there are few indications on how to use it and very few hints; this process is not intuitive in the beginning).
- they encountered several usability issues due to the use of many different widgets in the long thread.
- they witnessed problems with the loading time of the widgets in the long thread (too many of them, too many resources needed).
Section 5: Results – validation activities informing transferability, exploitation and barriers to adoption
No information provided.
Section 6: Conclusions
Validation Topics
OVT | Operational Validation Topic | Validated unconditionally | Validated with qualifications* | Not validated | Qualifications to validation
PVT1: Verification of the Long Thread
OVT1.1
The integration with a step-wise access and data transfer functions in the Long Thread.
PUB-NCIT
OUNL
PVT2: Tutor efficiency
OVT2.1
The teacher saves time and resources by using the long thread.
PUB-NCIT
OUNL
OVT2.2
There is less cognitive load required to use the Long Thread.
PUB-NCIT
OUNL
PVT3: Quality and consistency of (semi-)automatic feedback OR information returned by the system
OVT3.1
The combination of language technology services would be useful for my teaching.
PUB-NCIT
OUNL-TEL experts
OUNL-Teachers
PUB-NCIT
OUNL
PVT4: Making the educational process transparent
OVT4.1
N/A
PVT5: Quality of educational output
OVT5.1
N/A
PVT6: Motivation for learning
OVT6.1
The use of threading and the Long Thread are encouraging the motivation for learning.
OUNL
Teachers did not have hands-on experience with the Long Thread
PVT7: Organisational efficiency
OVT7.1
PVT8: Relevance
OVT8.1
The flexible combination of services in a thread has a potential to solve specific educational problems.
PUB-NCIT
OVT8.2
The use of threading has the potential to enlarge the possible user groups.
PUB-NCIT
OUNL
PUB-NCIT generated a set of possible new threads
PVT9: Likelihood of adoption
OVT9.1
I would be interested in using the combined language technology services after this pilot.
PUB-NCIT
OUNL
OVT9.2
The combined language technology services could work well alongside other software.
PUB-NCIT
OUNL
OVT9.3
The use of threading has the potential to enlarge the possible user groups.
PUB-NCIT
OUNL
Strong disagreement between TEL experts and teachers
Exploitation (Strengths, Weaknesses and Threats Analysis)
The objective you are asked to consider is: "The Long Thread will be adopted in pedagogic contexts beyond the end of the project".
Strengths
LT provides timely feedback of a consistent quality (objective, not tutor-dependent)
LT improves the independence of the learners (including skills to judge the feedback received)
LTfLL services and threading introduce new angles to approach educational problems
LT enables self-directed learning for complex tasks as a complete, supplementary approach to more traditional learning
Innovative technology and educational approaches
Weaknesses
Fear of bypassing teachers and too much reliance on computers and software
An imbalance between the educational gain and the increased (work)load
Quality issues of the feedback are risky for the learning process
The orientation on textual utterances makes transfer to Maths and “exact” sciences problematic or even impossible.
The value of the programs depends on strictly described learning tasks; they are not suitable for writing an essay on a topic of choice
Threats
Requires very large resources to set up: installing the corpus, involving tutors and creating the network
Threads specify a standardised task flow which may conflict with the learning style of the learners
Not all learners are able to critically judge feedback.
A steep learning curve: the LT requires the knowledge and the use of a lot of different tools
Too many widgets may lead to cognitive overload
Overall conclusion regarding the likelihood of adoption of the threading approach:
The concept of threading appears to be useful for the stakeholders in our Long Thread Validation. However, the current version is far from
ready to be sold as a stand-alone product. We consider that the “proof of concept” of threading has been accepted. The practical use of
threading now has to be proven in a wider range of educational contexts, but the problems (stability, quality and accuracy issues etc.) of some of the
individual services are a serious risk to successful validation in real contexts. As one of the tutors participating in the Long Thread validation said,
“Combining (doubtful) services might be detrimental to good aspects”. Clearly, a prerequisite for a more extensive roll-out of the Long Thread is
additional work at the level of the individual services, for which Version 1.5 is still an intermediate version.
Most important actions to promote adoption of FLSS:
Technical:
Improve the individual services
Improve the data integration and the possible workflows with their access and connections
Improve the consistency across the services used (interface, use of concepts).
Improve the loading time in particular whenever the LT needs many widgets
Roll out:
Ensure that the potential users gain experience in using the different individual LTfLL services to build up their self-confidence.
Provide convincing examples of educational threads (including other domains, languages and learning strategies).
Make setting up threads easier, e.g. outsource substantial parts of the preparation (corpora and delivering processed data to be
used in the services).
Deliver guidelines and instructions to manage the expectations of the stakeholders
Section 7 – Road map
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important future
enhancements to the system in order to meet stakeholder requirements:
Most important:
Improve the guidance on the screen
Make the interfaces of the different services more user friendly and more consistent across the services
Improve the quality of the results and the feedback (correctness, relevancy) generated within the services
Create an editing environment to enable the creation of threads by end users
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important changes to
the current scenario(s) of use in order to meet stakeholder requirements:
Most important:
Design a scenario that lets stakeholders without a Language Technology background create threads
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are possible additional educational
contexts for future deployment:
Most important:
Investigate whether the LTfLL approaches match the learning needs and pedagogic approaches at high schools.
Based on the results and conclusions from validation, the LTfLL team has agreed that the following are the most important issues for
future technical research to enable deployment of language technologies in educational contexts:
Most important:
Design the complete set of rules about input and output requirements to be met by the linked services.
Design how the data will be shared
Annex 1: Qualitative data from the learner workshop on threading (PUB-NCIT).
PART 1. Feedback for individual services:
CONSPECT
Strengths:
- Highlighting the concepts that are present in the course documentation;
- Highlighting the concepts that are related to a keyword the student is looking for, and the semantic relationships between the keyword and other concepts
- The conceptogram is dynamic, with a pleasant design that helps in the visualization of the concepts. It is also interactive.
- The existence of the zoom functionality for the conceptogram
Weaknesses:
- The widgets (using iframes) are a very bad idea: messy layout, a lot of nested scrollbars, poor usability;
- The conceptograms: the elements are too compact; most of the time several concepts are displayed on top of one another and are thus difficult to follow;
- Difficult to rearrange concepts in a conceptogram: they always try to go back to their initial location (due to the force-based layout, maybe?)
- There are few indications on how to use it, very few hints and texts that help the user
- There are no labels for the colors in the conceptogram.
- The language is technical, inaccessible to people in other domains (especially due to stemming: for many words it is not easy to determine the original
meaning)
- Very unintuitive user interface (go back buttons, zoom in/out)
- When combining two resources, there have been several errors
- Could not load new RSS feeds
Potential uses of the tool:
- Identification of concepts from a course and searching them on Wikipedia or Google
- Useful as it very easily extracts the most important concepts from different texts and shows the semantic links between them
PENSUM
Strengths:
- Detects relatively well the phrases that were not part of the summary
- Analyzes different types of summaries fairly well (even automatically generated ones)
- The feedback does not take very long
- After getting used to it (although this process is not very intuitive), it is easy to use overall
- The user interface has good potential; it is simple, coherent, good placement of the buttons
Weaknesses:
- The steps required to get new feedback need to be explained better (not very intuitive the first few times)
- For small phrases in the original text (including rhetorical phrases), the feedback always says that they were not covered (although they are irrelevant and
this kind of phrase should be excluded).
- If you start a new paragraph that is not related to the previous one, one will always get an error for the last phrase in the previous paragraph not being
“coherent” with the first phrase in the new paragraph.
- In each phrase, you need to have at least one keyword, otherwise it is labeled as incoherent
- Not very friendly user interface
- Sometimes, the feedback is not relevant or correct
- Seems to use only keywords in order to generate the feedback
Potential uses of the tool:
- For pupils and students that want to write summaries for certain courses.
- For teachers that need to summarize some course materials or papers
- For anyone that would like to summarize a discourse or a presentation
- For teachers that want to “verify”/assess a summary written by a student.
- Interactive lessons for high school
- Checking summaries and maybe even anti-fraud detection
POLYCAFE
Strengths:
- Many different ways to analyze a conversation
- The thread highlight feature is useful
- Useful to see the topics that were relevant to the discussion and the missing topics from a discussion
- Useful statistics about the conversation
- A way to find useful/fruitful conversations
Weaknesses:
- The indicators from the participant feedback maybe should be renamed as they do not have very good/useful labels
- Problems using the conversation thread tab in the conversation visualization sometimes
- Cannot make the correspondence from a key concept to a thread (highlight the most important threads given a concept)
- Several unimportant words (told, yes) are part of the conversation feedback
Potential uses of the tool:
The tool is useful for students because:
- it helps them improve their abilities to take part in a debate, in a dialogue, in a collaboration
- very good for self-evaluation, self-assessment, reflection
- semi-automated feedback for conversations
IFLSS
Strengths:
- Useful links for Youtube
- The suggested persons are really relevant
- Some of the scientific papers are relevant
- Some of the Slideshare presentations are relevant
Weaknesses:
- there would be a need for a concept graph (or similar concepts)
- In the resources from the social network, there is also a lot of junk besides the relevant resources
- duplicates in the scientific papers list
- difficult to read and use because the font is very small and every resource is underlined
Potential uses of the tool:
- use it as a “dictionary” (or perhaps a thesaurus) for new concepts
- personal learning (especially the Youtube videos)
- finding relevant people to offer you support and information in a given domain
- searching for scientific papers in a given domain
B. weaknesses of the current long thread:
- too many widgets that make it difficult to use (6-7 widgets should be a maximum)
- the widgets are very diverse (in look and feel, in how they work and respond)
- difficult to have a task that requires this current threading scenario
- there are usability issues due to the very high number of widgets, with lots of different information (students seem to get lost)
- students who are not very comfortable with computers would find it very difficult to adapt to the long thread
C. strengths of the current long thread:
- useful to have all the widgets interacting together if you have a complex task
- innovative technology and approach
- parts of the long thread are very useful
- it is useful to have communication between the widgets
D. conclusions for long thread
- the students opted for using shorter threads (at most 3 tools working together)
- there are some problems with the loading time of the widgets in the long thread (too many of them, too many resources are needed)
- the most useful links between widgets are from Conspect to IFLSS and from PolyCAFe to IFLSS
PART 2. Threading ideas:
Idea 1 – Documenting for an extensive project that needs to be solved in a team
1. Use IFLSS to search for relevant articles for the subject that is studied
2. Use Conspect to extract the common concepts from all the relevant resources returned at step 1, plus the links between them
3. Using the concepts detected in step 2 and the files from step 1, each student writes a summary, starting from these concepts, and verifies it
with Pensum
4. After that, the members of each team hold a chat brainstorming session to see what new each of them brings in the summaries for the studied subject. Using
PolyCAFe to analyze this chat, the students choose a project leader.
Idea 2 – Teaming up students by using their compatibilities (abilities)
1. Given a certain subject
2. Use IFLSS to get relevant people for the subject
3. Group them automatically in teams/groups for chat conversations.
4. Analyze each chat conversation with PolyCAFe
5. Determine the people that have the same level of abilities (are compatible) given the subject in order to team up for solving a problem.
Idea 3 – Improving iteratively your knowledge
1. Given a course and the materials to read
2. Write a summary and analyze it with Pensum
3. For the topics that were not covered, either:
(3a) locate them with Conspect in an already stored conceptogram
(3b) use IFLSS to read resources (or watch videos) about them
4. Return to step 2 and see if the summary has improved.
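The loop in Idea 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and parameter names (improve_summary, write_summary, uncovered_topics, study_topic) are hypothetical stand-ins for the learner, a Pensum-like check and a Conspect/IFLSS lookup; they are not part of any LTfLL service interface, and the toy data is invented for the example.

```python
from typing import Callable, List, Set


def improve_summary(
    write_summary: Callable[[Set[str]], str],
    uncovered_topics: Callable[[str], Set[str]],
    study_topic: Callable[[str], None],
    max_rounds: int = 5,
) -> List[Set[str]]:
    """Repeat steps 2 to 4 of Idea 3 until nothing is uncovered.

    Returns the set of uncovered topics reported in each round.
    All three callables are hypothetical stand-ins, not real service calls.
    """
    history: List[Set[str]] = []
    studied: Set[str] = set()
    for _ in range(max_rounds):
        summary = write_summary(studied)      # step 2: learner writes a summary
        missing = uncovered_topics(summary)   # step 2: Pensum-like feedback
        history.append(missing)
        if not missing:                       # step 4: summary covers everything
            break
        for topic in sorted(missing):         # step 3: look the topic up and study it
            study_topic(topic)
            studied.add(topic)
    return history


# Toy stand-ins so that the sketch can be run end to end.
COURSE_TOPICS = {"stemming", "coherence", "summarization"}


def toy_write_summary(studied: Set[str]) -> str:
    # The learner starts out knowing one topic and adds whatever was studied.
    return " ".join(sorted({"summarization"} | studied))


def toy_uncovered(summary: str) -> Set[str]:
    return COURSE_TOPICS - set(summary.split())


study_log: List[str] = []
history = improve_summary(toy_write_summary, toy_uncovered, study_log.append)
```

With these stand-ins the first round reports two uncovered topics and the second round reports none, so the loop stops after two rounds. The max_rounds cap is an assumption added to keep the loop finite when the feedback never becomes empty.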
Idea 4 – Suggesting resources that were missed from a conversation
1. Have a chat conversation (or a discussion thread in a forum)
2. Search the relevant concepts that are missing from the conversation with either:
(2a) IFLSS
(2b) locate them with Conspect in an already stored conceptogram
Idea 5 – Documenting for a project
1. Use a chat for brainstorming before the project
2. Analyze the chat results with PolyCAFe to detect the most important utterances.
3. Feed these utterances as input to Conspect to discover the most important concepts and their connections.
4. Use IFLSS to find additional resources.
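Idea 5 is a linear thread in which each service consumes the output of the previous one. A minimal sketch of that chaining, assuming each service can be wrapped as a plain function, might look as follows. The names run_thread, important_utterances, key_concepts and find_resources are hypothetical, and the three stand-ins use trivial rules purely so that the example runs; they do not reproduce what PolyCAFe, Conspect or IFLSS actually compute.

```python
from functools import reduce
from typing import Any, Callable, Dict, List, Sequence

Stage = Callable[[Any], Any]


def run_thread(stages: Sequence[Stage], initial: Any) -> Any:
    """Pass the data through every stage in order, output to input."""
    return reduce(lambda data, stage: stage(data), stages, initial)


def important_utterances(chat: List[str]) -> List[str]:
    # Hypothetical stand-in for PolyCAFe: keep utterances of four or more words.
    return [u for u in chat if len(u.split()) >= 4]


def key_concepts(utterances: List[str]) -> List[str]:
    # Hypothetical stand-in for Conspect: treat long words as concepts.
    words = {w.lower().strip(".,?!") for u in utterances for w in u.split()}
    return sorted(w for w in words if len(w) >= 8)


def find_resources(concepts: List[str]) -> Dict[str, str]:
    # Hypothetical stand-in for IFLSS: return a placeholder query per concept.
    return {c: f"search:{c}" for c in concepts}


chat = ["yes", "We should compare stemming with lemmatization.", "ok then"]
resources = run_thread([important_utterances, key_concepts, find_resources], chat)
```

Keeping the thread as an ordered list of stages means that a shorter thread is simply a shorter list, which fits the learners' stated preference for threads of at most three tools.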
Idea 6 – Documenting for the bachelor thesis
1. Feed the documentation for the thesis topic into Conspect in order to generate the conceptogram for each book and article.
2. Combine the conceptograms to get the most common concepts.
3. Use IFLSS to find additional resources for the most common and most important concepts.
4. Document these results together with the original documents.
5. Have a chat with the tutor (or a master student).
6. Analyze it with PolyCAFe to see whether the student performs well (similar to the tutor or master student).