Predicting learning performance using NLP: an exploratory study using two semantic textual similarity methods

Papadimas, C.; Ragazou, V.; Karasavvidis, I.; Kollias, V.

doi:10.1007/s10115-024-02293-2

Regular Paper
Published: 13 February 2025

Knowledge and Information Systems (2025)

C. Papadimas¹, V. Ragazou¹, I. Karasavvidis¹ & V. Kollias²

Abstract

Most learning analytics (LA) systems provide generic feedback because they primarily draw on performance data based on quiz scores. This study explored the potential of student-generated summaries as an alternative method for predicting learning performance. Two hundred and fifty-four undergraduates first watched a series of six short video lectures and then wrote a short summary of each one. Based on their median quiz performance scores, the participants were divided into two performance groups. Sparse and dense text vectorization methods were used to represent the video lectures and student summaries. Three semantic textual similarity features were computed using cosine similarity and were used as input to seven common machine learning algorithms. The results indicated that the sparse similarity features outperformed the dense ones in classifying performance. The best classification accuracy was achieved with the K-Nearest Neighbors and Random Forest algorithms. Overall, the findings suggest that semantic similarity measures can be used as additional proxy measures of learning, thereby enabling the real-time monitoring and evaluation of student understanding in LA contexts.
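The general idea of a sparse similarity feature can be illustrated with a minimal sketch. It is not the authors' pipeline: the tokenizer, the smoothed TF-IDF weighting, the strict "above the median" split rule, the toy texts and scores, and all function names are assumptions introduced here for illustration only. The sketch builds sparse TF-IDF vectors for a lecture text and two summaries, computes the cosine similarity of each summary to the lecture, and derives binary performance labels from a median split of quiz scores.

```python
import math
import re
from collections import Counter
from statistics import median


def tokenize(text):
    """Lowercase and keep alphabetic tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumption): terms found in every document keep a nonzero weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: c * idf[t] for t, c in Counter(tokens).items()} for tokens in tokenized]


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def median_split(scores):
    """Label a score 1 if it lies above the median, else 0 (assumed split rule)."""
    cut = median(scores)
    return [1 if s > cut else 0 for s in scores]


# Hypothetical toy data, not taken from the study.
lecture = "Cosine similarity measures the angle between two text vectors."
summaries = [
    "Cosine similarity compares the angle between text vectors.",
    "The video was about cooking pasta at home.",
]
quiz_scores = [9, 4]

vectors = tfidf_vectors([lecture] + summaries)
similarities = [cosine_similarity(vectors[0], v) for v in vectors[1:]]
labels = median_split(quiz_scores)
```

In a setup like the one the abstract describes, values such as `similarities` would serve as input features and `labels` as the target for a classifier such as K-Nearest Neighbors or Random Forest. A dense variant would swap the TF-IDF vectors for embedding vectors while keeping the cosine step unchanged.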

Acknowledgements

An earlier version of this work was presented at the 20th EARLI Conference, Thessaloniki, Greece. A grant from Greece and the European Union (European Social Fund—ESF) through the Operational Programme "Human Resources Development, Education and Lifelong Learning 2014-2020" in the context of the project “Planning, Development and Deployment of an Intelligent Feedback System Using Supervised and Unsupervised Machine Learning Methods” (MIS 5048955) provided the funding that initiated this line of work. We would like to thank all students who participated in the study. Lastly, we are grateful to the four anonymous reviewers, whose helpful comments and suggestions led to a substantial improvement of the initial manuscript.

Author information

Authors and Affiliations

Department of Early Childhood Education, University of Thessaly, Volos, Greece
C. Papadimas, V. Ragazou & I. Karasavvidis
Department of Primary Education, University of Thessaly, Volos, Greece
V. Kollias

Contributions

CP helped with data collection, analysis and review. VR was involved in data collection, review and editing. IK contributed to conceptualization, analysis, writing, validation and review. VK contributed to writing and review. All authors read and approved the final manuscript.

Corresponding author

Correspondence to V. Kollias.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Papadimas, C., Ragazou, V., Karasavvidis, I. et al. Predicting learning performance using NLP: an exploratory study using two semantic textual similarity methods. Knowl Inf Syst (2025). https://doi.org/10.1007/s10115-024-02293-2

Received: 10 November 2023
Revised: 30 October 2024
Accepted: 26 November 2024
Published: 13 February 2025
DOI: https://doi.org/10.1007/s10115-024-02293-2
