DOI: 10.1145/2818346.2830602
research-article

Multimodal Capture of Teacher-Student Interactions for Automated Dialogic Analysis in Live Classrooms

Published: 09 November 2015 Publication History

Abstract

We focus on data collection designs for the automated analysis of teacher-student interactions in live classrooms with the goal of identifying instructional activities (e.g., lecturing, discussion) and assessing the quality of dialogic instruction (e.g., analysis of questions). Our designs were motivated by multiple technical requirements and constraints. Most importantly, teachers could be individually mic'ed, but their audio needed to be of excellent quality for automatic speech recognition (ASR) and spoken utterance segmentation. Individual students could not be mic'ed, but classroom audio quality only needed to be sufficient to detect student spoken utterances. Visual information could only be recorded if students could not be identified. Design 1 used an omnidirectional laptop microphone to record both teacher and classroom audio and was quickly deemed unsuitable. In Designs 2 and 3, teachers wore a wireless Samson AirLine 77 vocal headset system, which is a unidirectional microphone with a cardioid pickup pattern. In Design 2, classroom audio was recorded with dual first-generation Microsoft Kinects placed at the front corners of the class. Design 3 used a Crown PZM-30D pressure zone microphone mounted on the blackboard to record classroom audio. Designs 2 and 3 were tested by recording audio in 38 live middle school classrooms from six U.S. schools while trained human coders simultaneously performed live coding of classroom discourse. Qualitative and quantitative analyses revealed that Design 3 was suitable for three of our core tasks: (1) ASR on teacher speech (word recognition rate of 66% and word overlap rate of 69% using the Google Speech ASR engine); (2) teacher utterance segmentation (F-measure of 97%); and (3) student utterance segmentation (F-measure of 66%). Ideas to incorporate video and skeletal tracking with dual second-generation Kinects to produce Design 4 are discussed.



        Published In

        ICMI '15: Proceedings of the 2015 ACM on International Conference on Multimodal Interaction
        November 2015
        678 pages
        ISBN:9781450339124
        DOI:10.1145/2818346

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. classroom discourse
        2. dialogic instruction
        3. multimodal

        Qualifiers

        • Research-article

        Funding Sources

        • Institute of Education Sciences

        Conference

        ICMI '15: International Conference on Multimodal Interaction
        November 9-13, 2015
        Seattle, Washington, USA

        Acceptance Rates

        ICMI '15 Paper Acceptance Rate 52 of 127 submissions, 41%;
        Overall Acceptance Rate 453 of 1,080 submissions, 42%

