Abstract
This research investigates how well semantic text models assess student responses during tutoring, compared with expert human judges. Recent interest in text similarity has led to a proliferation of models that could be used to assess student responses; however, it is unclear whether these models perform as well as traditional distributional semantic models, such as Latent Semantic Analysis, for student response assessment in automatic short answer grading. We assessed 5166 response pairings from 219 participants across 118 electronics questions and scored each with 13 different computational text models, including models that use regular expressions, distributional semantics, word embeddings, contextual embeddings, and combinations of these features. We show that a few semantic text models perform comparably to Latent Semantic Analysis and, in some cases, outperform it. Furthermore, combination models outperformed individual models in agreement with human judges. Choosing appropriate computational techniques and optimizing the text model may continue to improve accuracy, recall, and weighted agreement, and therefore the effectiveness of conversational intelligent tutoring systems (ITSs).
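As a rough illustration of the kind of pipeline the abstract describes, the sketch below scores a student response against an expected answer, combines a similarity score with a regular-expression match, and compares thresholded decisions with human judgments. Everything here is an assumption for illustration: the bag-of-words cosine similarity stands in for the paper's distributional and embedding models, and the function names, the equal weighting, the 0.5 threshold, and the toy response pairs are hypothetical, not taken from the study.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(response, expectation):
    """Cosine similarity of bag-of-words counts (a stand-in for a semantic model)."""
    va, vb = Counter(tokenize(response)), Counter(tokenize(expectation))
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def regex_match(pattern, response):
    """Return 1.0 if the hand-written pattern occurs in the response, else 0.0."""
    return 1.0 if re.search(pattern, response, re.IGNORECASE) else 0.0


def combined_score(response, expectation, pattern, weight=0.5):
    """Hypothetical combination model: weighted mean of similarity and regex match."""
    similarity = cosine_similarity(response, expectation)
    return weight * similarity + (1 - weight) * regex_match(pattern, response)


def accuracy_and_recall(predicted, human):
    """Agreement of model decisions with human judgments (both lists of bools)."""
    correct = sum(p == h for p, h in zip(predicted, human))
    true_pos = sum(p and h for p, h in zip(predicted, human))
    positives = sum(human)
    recall = true_pos / positives if positives else 0.0
    return correct / len(human), recall


# Toy response pairings (invented for this sketch, not data from the study):
# (student response, expected answer, regex, human judgment)
EXPECTATION = "current increases as resistance decreases"
PATTERN = r"current.*(increase|rise)"
pairs = [
    ("the current increases when resistance decreases", EXPECTATION, PATTERN, True),
    ("the voltage stays the same", EXPECTATION, PATTERN, False),
    ("current rises if you lower the resistance", EXPECTATION, PATTERN, True),
]

THRESHOLD = 0.5  # assumed cutoff for counting a response as a match
predicted = [combined_score(r, e, p) >= THRESHOLD for r, e, p, _ in pairs]
human = [h for _, _, _, h in pairs]
accuracy, recall = accuracy_and_recall(predicted, human)
```

In a real evaluation the similarity function would be replaced by each candidate model in turn, and the weight and threshold would be tuned on held-out data rather than fixed by hand.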
Acknowledgments
This research was supported by the Office of Naval Research (N00014-00-1-0600, N00014-15-P-1184, N00014-12-C-0643, N00014-16-C-3027) and the National Science Foundation Data Infrastructure Building Blocks program (ACI-1443068). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of ONR or NSF.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Carmon, C.M., Hu, X., Graesser, A.C. (2023). Assessment in Conversational Intelligent Tutoring Systems: Are Contextual Embeddings Really Better?. In: Wang, N., Rebolledo-Mendez, G., Dimitrova, V., Matsuda, N., Santos, O.C. (eds) Artificial Intelligence in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and Innovation Tracks, Practitioners, Doctoral Consortium and Blue Sky. AIED 2023. Communications in Computer and Information Science, vol 1831. Springer, Cham. https://doi.org/10.1007/978-3-031-36336-8_19
DOI: https://doi.org/10.1007/978-3-031-36336-8_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36335-1
Online ISBN: 978-3-031-36336-8
eBook Packages: Computer Science (R0)