Assessment in Conversational Intelligent Tutoring Systems: Are Contextual Embeddings Really Better?

Carmon, Colin M.; Hu, Xiangen; Graesser, Arthur C.

doi:10.1007/978-3-031-36336-8_19


Part of the book series: Communications in Computer and Information Science (CCIS, volume 1831)

Included in the following conference series:

International Conference on Artificial Intelligence in Education


Abstract

This research investigates how well semantic text models assess student responses during tutoring, compared with expert human judges. Recent interest in text similarity has led to a proliferation of models that could be used to assess student responses; however, it is unclear whether these models perform as well as traditional distributional semantic models such as Latent Semantic Analysis for automatic short answer grading. We assessed 5166 response pairings from 219 participants across 118 electronics questions and scored each with 13 different computational text models, including models that use regular expressions, distributional semantics, word embeddings, contextual embeddings, and combinations of these features. We show that a few semantic text models perform comparably to Latent Semantic Analysis and, in some cases, outperform it. Furthermore, combination models outperformed individual models in agreement with human judges. Choosing appropriate computational techniques and optimizing the text model may continue to improve the accuracy, recall, and weighted agreement, and therefore the effectiveness, of conversational intelligent tutoring systems (ITSs).
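To make the evaluation setup concrete, the following is a minimal sketch of scoring response pairings with a text model and comparing the result with human judgments. It is not one of the 13 models from the study: it assumes a plain bag-of-words cosine similarity, a hypothetical match threshold of 0.5, and invented example pairs and human labels. The function names are illustrative only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the bag-of-words count vectors of two strings."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def model_judgments(pairs, threshold=0.5):
    """Label a pair 1 (match) when similarity meets the assumed threshold, else 0."""
    return [int(cosine_similarity(resp, exp) >= threshold) for resp, exp in pairs]


def accuracy_and_recall(model, human):
    """Accuracy and recall of model labels, treating human labels as ground truth."""
    correct = sum(m == h for m, h in zip(model, human))
    true_pos = sum(m == 1 and h == 1 for m, h in zip(model, human))
    positives = sum(human)
    accuracy = correct / len(human)
    recall = true_pos / positives if positives else 0.0
    return accuracy, recall


# Hypothetical expectation and student responses; the human labels are made up.
expectation = "a diode allows current to flow in one direction"
pairs = [
    ("the diode conducts current in one direction", expectation),
    ("it stores charge", expectation),
    ("current flows in one direction through the diode", expectation),
    ("electricity only passes one way", expectation),  # paraphrase with little word overlap
]
human = [1, 0, 1, 1]

model = model_judgments(pairs)
accuracy, recall = accuracy_and_recall(model, human)
print(model, accuracy, recall)
```

The last pair is a correct paraphrase that shares almost no words with the expectation, so a purely lexical model misses it. Cases like this are the motivation for comparing distributional and embedding-based models against human judges.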



Acknowledgments

This research was supported by the Office of Naval Research (N00014-00-1-0600, N00014-15-P-1184, N00014-12-C-0643, N00014-16-C-3027) and the National Science Foundation Data Infrastructure Building Blocks program (ACI-1443068). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of ONR or NSF.

Author information

Authors and Affiliations

University of Memphis Psychology, Memphis, TN, 38152, USA
Colin M. Carmon, Xiangen Hu & Arthur C. Graesser
Institute for Intelligent Systems, University of Memphis, Memphis, TN, 38152, USA
Colin M. Carmon, Xiangen Hu & Arthur C. Graesser

Corresponding author

Correspondence to Colin M. Carmon.

Editor information

Editors and Affiliations

University of Southern California, Los Angeles, CA, USA
Ning Wang
University of British Columbia, Vancouver, BC, Canada
Genaro Rebolledo-Mendez
University of Leeds, Leeds, UK
Vania Dimitrova
North Carolina State University, Raleigh, NC, USA
Noboru Matsuda
UNED, Madrid, Spain
Olga C. Santos

About this paper

Cite this paper

Carmon, C.M., Hu, X., Graesser, A.C. (2023). Assessment in Conversational Intelligent Tutoring Systems: Are Contextual Embeddings Really Better? In: Wang, N., Rebolledo-Mendez, G., Dimitrova, V., Matsuda, N., Santos, O.C. (eds) Artificial Intelligence in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and Innovation Tracks, Practitioners, Doctoral Consortium and Blue Sky. AIED 2023. Communications in Computer and Information Science, vol 1831. Springer, Cham. https://doi.org/10.1007/978-3-031-36336-8_19

DOI: https://doi.org/10.1007/978-3-031-36336-8_19
Published: 30 June 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36335-1
Online ISBN: 978-3-031-36336-8
eBook Packages: Computer Science, Computer Science (R0)
