- Article, October 2005
Learning mixed initiative dialog strategies by using reinforcement learning on both conversants
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 1011–1018, https://doi.org/10.3115/1220575.1220702
This paper describes an application of reinforcement learning to determine a dialog policy for a complex collaborative task where policies for both the system and a proxy for a user of the system are learned simultaneously. With this approach a useful ...
- Article, October 2005
Speech-based information retrieval system with clarification dialogue strategy
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 1003–1010, https://doi.org/10.3115/1220575.1220701
This paper addresses a dialogue strategy to clarify and constrain the queries for speech-driven document retrieval systems. In spoken dialogue interfaces, users often make utterances before the query is completely generated in their mind; thus input ...
- Article, October 2005
Flexible text segmentation with structured multilabel classification
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 987–994, https://doi.org/10.3115/1220575.1220699
Many language processing tasks can be reduced to breaking the text into segments with prescribed properties. Such tasks include sentence splitting, tokenization, named-entity extraction, and chunking. We present a new model of text segmentation based on ...
- Article, October 2005
A generalized framework for revealing analogous themes across related topics
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 979–986, https://doi.org/10.3115/1220575.1220698
This work addresses the task of identifying thematic correspondences across sub-corpora focused on different topics. We introduce an unsupervised algorithmic framework based on distributional data clustering, which generalizes previous initial works on ...
- Article, October 2005
An orthonormal basis for topic segmentation in tutorial dialogue
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 971–978, https://doi.org/10.3115/1220575.1220697
This paper explores the segmentation of tutorial dialogue into cohesive topics. A latent semantic space was created using conversations from human to human tutoring transcripts, allowing cohesion between utterances to be measured using vector ...
- Article, October 2005
Learning a spelling error model from search query logs
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 955–962, https://doi.org/10.3115/1220575.1220695
Applying the noisy channel model to search query spelling correction requires an error model and a language model. Typically, the error model relies on a weighted string edit distance measure. The weights can be learned from pairs of misspelled words ...
- Article, October 2005
Searching the audio notebook: keyword search in recorded conversations
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 947–954, https://doi.org/10.3115/1220575.1220694
MIT's Audio Notebook added great value to the note-taking process by retaining audio recordings, e.g. during lectures or interviews. The key was to provide users ways to quickly and easily access portions of interest in a recording. Several non-speech-...
- Article, October 2005
Integrating linguistic knowledge in passage retrieval for question answering
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 939–946, https://doi.org/10.3115/1220575.1220693
In this paper we investigate the use of linguistic knowledge in passage retrieval as part of an open-domain question answering system. We use annotation produced by a deep syntactic dependency parser for Dutch, Alpino, to extract various kinds of ...
- Article, October 2005
Multi-perspective question answering using the OpQA corpus
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 923–930, https://doi.org/10.3115/1220575.1220691
We investigate techniques to support the answering of opinion-based questions. We first present the OpQA corpus of opinion questions and answers. Using the corpus, we compare and contrast the properties of fact and opinion questions and answers. Based ...
- Article, October 2005
Using random walks for question-focused sentence retrieval
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 915–922, https://doi.org/10.3115/1220575.1220690
We consider the problem of question-focused sentence retrieval from complex news articles describing multi-event stories published over time. Annotators generated a list of questions central to understanding each story in our corpus. Because of the ...
- Article, October 2005
A semi-supervised feature clustering algorithm with application to word sense disambiguation
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 907–914, https://doi.org/10.3115/1220575.1220689
In this paper we investigate an application of feature clustering for word sense disambiguation, and propose a semisupervised feature clustering algorithm. Compared with other feature clustering methods (ex. supervised feature clustering), it can infer ...
- Article, October 2005
Measuring the relative compositionality of verb-noun (V-N) collocations by integrating features
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 899–906, https://doi.org/10.3115/1220575.1220688
Measuring the relative compositionality of Multi-word Expressions (MWEs) is crucial to Natural Language Processing. Various collocation based measures have been proposed to compute the relative compositionality of MWEs. In this paper, we define novel ...
- Article, October 2005
A semantic scattering model for the automatic interpretation of genitives
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 891–898, https://doi.org/10.3115/1220575.1220687
This paper addresses the automatic classification of the semantic relations expressed by the English genitives. A learning model is introduced based on the statistical analysis of the distribution of genitives' semantic relations on a large corpus. The ...
- Article, October 2005
Exploiting a verb lexicon in automatic semantic role labelling
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 883–890, https://doi.org/10.3115/1220575.1220686
We develop an unsupervised semantic role labelling system that relies on the direct application of information in a predicate lexicon combined with a simple probability model. We demonstrate the usefulness of predicate lexicons for role labelling, as ...
- Article, October 2005
Inducing a multilingual dictionary from a parallel multitext in related languages
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 875–882, https://doi.org/10.3115/1220575.1220685
Dictionaries and word translation models are used by a variety of systems, especially in machine translation. We build a multilingual dictionary induction system for a family of related resource-poor languages. We assume only the presence of a single ...
- Article, October 2005
OCR post-processing for low density languages
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 867–874, https://doi.org/10.3115/1220575.1220684
We present a lexicon-free post-processing method for optical character recognition (OCR), implemented using weighted finite state machines. We evaluate the technique in a number of scenarios relevant for natural language processing, including creation ...
- Article, October 2005
Cross-linguistic projection of role-semantic information
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 859–866, https://doi.org/10.3115/1220575.1220683
This paper considers the problem of automatically inducing role-semantic annotations in the FrameNet paradigm for new languages. We introduce a general framework for semantic projection which exploits parallel texts, is relatively inexpensive and can ...
- Article, October 2005
A backoff model for bootstrapping resources for non-English languages
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 851–858, https://doi.org/10.3115/1220575.1220682
The lack of annotated data is an obstacle to the development of many natural language processing applications; the problem is especially severe when the data is non-English. Previous studies suggested the possibility of acquiring resources for non-...
- Article, October 2005
Paradigmatic modifiability statistics for the extraction of complex multi-word terms
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 843–850, https://doi.org/10.3115/1220575.1220681
We here propose a new method which sets apart domain-specific terminology from common non-specific noun phrases. It is based on the observation that terminological multi-word groups reveal a considerably lesser degree of distributional variation than ...
- Article, October 2005
Using the web as an implicit training set: application to structural ambiguity resolution
HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, Pages 835–842, https://doi.org/10.3115/1220575.1220680
Recent work has shown that very large corpora can act as training data for NLP algorithms even without explicit labels. In this paper we show how the use of surface features and paraphrases in queries against search engines can be used to infer labels ...