Predicting learning performance using NLP: an exploratory study using two semantic textual similarity methods

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Most learning analytics (LA) systems provide generic feedback because they draw primarily on performance data based on quiz scores. This study explored the potential of student-generated summaries as an alternative method for predicting learning performance. Two hundred and fifty-four undergraduates first watched a series of six short video lectures and then wrote a short summary of each one. Based on the median of their quiz scores, the participants were divided into two performance groups. Sparse and dense text vectorization methods were used to represent the video lectures and student summaries. Three semantic textual similarity features were computed using cosine similarity and were used as input to seven common machine learning algorithms. The results indicated that the sparse similarity features outperformed the dense ones in classifying performance. The best classification accuracy was achieved using the K-Nearest Neighbors and Random Forest algorithms. Overall, the findings suggest that semantic similarity measures can be used as additional proxy measures of learning, thereby enabling the real-time monitoring and evaluation of student understanding in LA contexts.
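
The pipeline outlined in the abstract (vectorize lecture and summary, compute cosine similarity, label students by a median split) can be illustrated with a minimal sketch. It uses only the Python standard library and a hand-rolled TF-IDF weighting as one possible sparse representation. The texts, quiz scores, function names, and smoothing formula below are illustrative assumptions, not the study's data or implementation; the abstract does not specify the vectorizers or the three similarity features.

```python
import math
import re
from collections import Counter
from statistics import median


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF (an assumption) keeps every weight strictly positive.
        vectors.append({
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, tf in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def median_split(scores):
    """Label each score 1 if it is above the median, otherwise 0."""
    cut = median(scores)
    return [1 if s > cut else 0 for s in scores]


# Hypothetical lecture transcript and two hypothetical student summaries.
lecture = "Photosynthesis converts light energy into chemical energy stored in glucose."
summaries = [
    "Plants use light energy to make glucose, storing chemical energy.",
    "The video was about how cells divide during mitosis.",
]

vectors = tfidf_vectors([lecture] + summaries)
similarities = [cosine_similarity(vectors[0], v) for v in vectors[1:]]

# Hypothetical quiz scores for the same two students.
labels = median_split([9, 4])
```

In a fuller version of this sketch, each student's similarity values would form the feature vector and the median-split labels would be the classification target for algorithms such as K-Nearest Neighbors or Random Forest. A dense variant would swap the TF-IDF dictionaries for embedding vectors while keeping the same cosine step.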

Acknowledgements

An earlier version of this work was presented at the 20th EARLI Conference, Thessaloniki, Greece. A grant from Greece and the European Union (European Social Fund—ESF) through the Operational Programme "Human Resources Development, Education and Lifelong Learning 2014-2020" in the context of the project “Planning, Development and Deployment of an Intelligent Feedback System Using Supervised and Unsupervised Machine Learning Methods” (MIS 5048955) provided the funding that initiated this line of work. We would like to thank all students who participated in the study. Lastly, we are grateful to the four anonymous reviewers, whose helpful comments and suggestions led to a substantial improvement of the initial manuscript.

Author information

Contributions

CP helped with data collection, analysis, and review; VR was involved in data collection, review, and editing; IK contributed to conceptualization, analysis, writing, validation, and review; VK contributed to writing and review. All authors read and approved the final manuscript.

Corresponding author

Correspondence to V. Kollias.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Papadimas, C., Ragazou, V., Karasavvidis, I. et al. Predicting learning performance using NLP: an exploratory study using two semantic textual similarity methods. Knowl Inf Syst (2025). https://doi.org/10.1007/s10115-024-02293-2

  • DOI: https://doi.org/10.1007/s10115-024-02293-2
