short-paper

Classifying Tutor Discursive Moves at Scale in Mathematics Classrooms with Large Language Models

Authors: Baptiste Moreau-Pernet, Thomas Christie

L@S '24: Proceedings of the Eleventh ACM Conference on Learning @ Scale

Pages 361 - 365

https://doi.org/10.1145/3657604.3664664

Published: 15 July 2024

Abstract

In mathematics tutoring, appropriate instructional discursive strategies, called "talk moves", are critical to supporting student learning. Training tutors to use talk moves appropriately is a key component of tutor development programs, but tutor development at scale is a challenge. Recent research has shown that automatic classification of talk moves in tutorial discourse can facilitate large-scale delivery of personalized talk moves feedback. In this paper, we build on this work and share our current progress in using large language models to classify talk moves in transcripts of tutoring sessions. We report classification results from three approaches: fine-tuned models, prompt optimization, and supervised classification of embedding vectors. The fine-tuning strategy performed best, yielding better performance in predicting expert labels (.87 macro and .93 weighted F1 score) than the current state-of-the-art RoBERTa model. We discuss trade-offs across methods and models.
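
The abstract reports both a macro and a weighted F1 score. As a rough illustration of how the two averages differ on imbalanced labels, here is a minimal pure-Python sketch. The label names and the toy predictions are hypothetical placeholders chosen for illustration; they are not the paper's data, label set, or evaluation code.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's count in y_true."""
    scores = f1_per_class(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


# Hypothetical toy labels: one frequent class and two rare ones.
y_true = ["none"] * 6 + ["revoicing"] * 2 + ["pressing_for_accuracy"] * 2
y_pred = ["none"] * 6 + ["revoicing", "none"] + ["pressing_for_accuracy"] * 2

macro = macro_f1(y_true, y_pred)
weighted = weighted_f1(y_true, y_pred)
print(f"macro F1 = {macro:.3f}, weighted F1 = {weighted:.3f}")
```

In this toy case the single error falls on a rare class, so the macro average drops further than the weighted one. A gap in the same direction, as in the reported .87 versus .93, is what one would expect if rarer classes are harder to predict, although the abstract itself does not say so.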

Index Terms

1. Applied computing
  1. Education
    1. Computer-assisted instruction

Published In

L@S '24: Proceedings of the Eleventh ACM Conference on Learning @ Scale

July 2024, 582 pages

ISBN: 9798400706332

DOI: 10.1145/3657604

General Chair: David Joyner (Georgia Tech, USA)

Program Chairs: Min Kyu Kim (Georgia State University, USA), Xu Wang (University of Michigan, USA), Meng Xia (Texas A&M University, USA)

Copyright © 2024 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

L@S '24: Eleventh ACM Conference on Learning @ Scale

July 18 - 20, 2024

Atlanta, GA, USA

Acceptance Rates

Overall Acceptance Rate 117 of 440 submissions, 27%
