Part of the book series: Communications in Computer and Information Science (CCIS, volume 1831)

Abstract

This research investigates how well semantic text models assess student responses during tutoring compared with expert human judges. Recent interest in text similarity has led to a proliferation of models that could potentially be used for assessing student responses; however, it is unclear whether these models perform as well as traditional distributional semantic models, such as Latent Semantic Analysis, for student response assessment in automatic short answer grading. We assessed 5166 response pairings from 219 participants across 118 electronics questions, scoring each pairing with 13 computational text models, including models based on regular expressions, distributional semantics, word embeddings, contextual embeddings, and combinations of these features. A few semantic text models performed comparably to Latent Semantic Analysis and in some cases outperformed it. Furthermore, combination models agreed with human judges more closely than individual models did. Choosing appropriate computational techniques and optimizing the text model may continue to improve accuracy, recall, and weighted agreement, and therefore the effectiveness of conversational intelligent tutoring systems (ITSs).



Acknowledgments

This research was supported by the Office of Naval Research (N00014-00-1-0600, N00014-15-P-1184, N00014-12-C-0643, N00014-16-C-3027) and the National Science Foundation Data Infrastructure Building Blocks program (ACI-1443068). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of ONR or NSF.

Author information


Corresponding author

Correspondence to Colin M. Carmon.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Carmon, C.M., Hu, X., Graesser, A.C. (2023). Assessment in Conversational Intelligent Tutoring Systems: Are Contextual Embeddings Really Better?. In: Wang, N., Rebolledo-Mendez, G., Dimitrova, V., Matsuda, N., Santos, O.C. (eds) Artificial Intelligence in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and Innovation Tracks, Practitioners, Doctoral Consortium and Blue Sky. AIED 2023. Communications in Computer and Information Science, vol 1831. Springer, Cham. https://doi.org/10.1007/978-3-031-36336-8_19

  • DOI: https://doi.org/10.1007/978-3-031-36336-8_19

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36335-1

  • Online ISBN: 978-3-031-36336-8

  • eBook Packages: Computer Science, Computer Science (R0)
