Most semantic role labeling (SRL) research has been focused on training and evaluating on the same corpus. This strategy, although appropriate for initiating research, can lead to overtraining to the particular corpus. This article describes the operation of ASSERT, a state-of-the-art SRL system, and analyzes the robustness of the system when trained on one genre of data and used to label a different genre. As a starting point, results are first presented for training and testing the system on the PropBank corpus, which is annotated Wall Street Journal (WSJ) data. Experiments are then presented to evaluate the portability of the system to another source of data. These experiments are based on comparisons of performance using PropBanked WSJ data and PropBanked Brown Corpus data. The results indicate that whereas syntactic parses and argument identification transfer relatively well to a new corpus, argument classification does not. An analysis of the reasons for this is presented and ...
Currently, providing teachers with detailed feedback about their classroom discourse strategies requires highly trained observers to hand code transcripts of classroom recordings to identify talk moves and/or one-on-one expert coaching. Both approaches are time-consuming and expensive, require considerable human expertise, and do not scale to large numbers of teachers. We are currently developing an innovative application, the TalkBack application, a new type of teacher learning environment based on the automated analysis of classroom recordings. The TalkBack application will utilize a big data infrastructure for managing and analyzing classroom recordings, including an embedded automated talk move classifier. The application will provide teachers with a detailed record of the discourse strategies used in their lessons. A central premise of our research is that this type of personalized, automated feedback can dramatically enhance teacher learning and support improvements in their instruction. The project will exemplify how next-generation repositories of classroom recordings can be architected to support large-scale research by enabling automated analyses based on machine learning models.
Enhancing Learning Using Adaptive Computerized Tutoring in K-12 Settings Gautam Biswas 1 (gautam.biswas@vanderbilt.edu) Daniel Schwartz 2 & Kefyn M. Catley 3 Carol O’Donnell (Carol.O’Donnell@ed.gov) & Robin Harwood Institute of Education Sciences, U.S. Dept. of Education, Washington, DC 20208 Department of Computer Science Vanderbilt University, Nashville, TN 37240 School of Education, Stanford University Department of Biology, Western Carolina University Barry Gholson (b.gholson@mail.psyc.memphis.edu) Art Graesser & Scotty D. Craig Stephanie Siler (siler@andrew.cmu.edu) Department of Psychology The University of Memphis, Memphis, TN 38152 Department of Psychology Carnegie Mellon University Pittsburgh, PA 15213 Wayne Ward 1 (Wayne.Ward@Colorado.edu) & Ronald Cole 2 Center for Computational Language & Ed. Research, University of Colorado, Boulder, CO 80309 Boulder Language Technologies Keywords: Adaptive computerized tutoring; dialog; self-regulated learning; deep-level reasoning; s...
Disease progression and understanding rely on temporal concepts. Discovery of automated temporal relations and timelines from the clinical narrative allows for mining large data sets of clinical text to uncover patterns at the disease and patient level. Our overall goal is the complex task of building a system for automated temporal relation discovery. As a first step, we evaluate enabling methods from the general natural language processing domain - deep parsing and semantic role labeling in predicate-argument structures - to explore their portability to the clinical domain. As a second step, we develop an annotation schema for temporal relations based on TimeML. In this paper we report results and findings from these first steps. Our next efforts will scale up the data collection to develop domain-specific modules for the enabling technologies within Mayo’s open-source clinical Text Analysis and Knowledge Extraction System.
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP, Jun 7, 2012
A key challenge for dialogue-based intelligent tutoring systems lies in selecting follow-up questions that are not only context relevant but also encourage self-expression and stimulate learning. This paper presents an approach to ranking candidate questions for a given dialogue context and introduces an evaluation framework for this task. We learn to rank using judgments collected from expert human tutors, and we show that adding features derived from a rich, multi-layer dialogue act representation improves system performance over baseline lexical and syntactic features to a level in agreement with the judges. The experimental results highlight the important factors in modeling the questioning process. This work provides a framework for future work in automatic question generation, and it represents a step toward the larger goal of learning tutorial dialogue policies directly from human examples.
An Integrated Speech and Natural Language Dialog System: Using Dialog Knowledge in Speech Recognition. Sheryl R. Young, Wayne H. Ward, Alexander G. Hauptmann, and Zongge Li. Computer Science Department, Carnegie Mellon University ...
Papers by Wayne Ward