- research-article, May 2009
Tightly coupling speech recognition and search
In this paper, we discuss the benefits of tightly coupling speech recognition and search components in the context of a speech-driven search application. We demonstrate that by incorporating constraints from the information repository that is being ...
- research-article, May 2009
Fast decoding for open vocabulary spoken term detection
Information retrieval and spoken-term detection from audio such as broadcast news, telephone conversations, conference calls, and meetings are of great interest to the academic, government, and business communities. Motivated by the requirement for high-...
- research-article, May 2009
Automatic Chinese abbreviation generation using conditional random field
This paper presents a new method for automatically generating abbreviations for Chinese organization names. Abbreviations are commonly used in spoken Chinese, especially for organization names. The generation of Chinese abbreviations is much more complex ...
- research-article, May 2009
Score distribution based term specific thresholding for spoken term detection
The spoken term detection (STD) task aims to return relevant segments from a spoken archive that contain the query terms. This paper focuses on the decision stage of an STD system. We propose a term specific thresholding (TST) method that uses per query ...
- research-article, May 2009
Anchored speech recognition for question answering
In this paper, we propose a novel question answering system that searches for responses from spoken documents such as broadcast news stories and conversations. We propose a novel two-step approach, which we refer to as anchored speech recognition, to ...
- research-article, May 2009
Reverse revision and linear tree combination for dependency parsing
Deterministic transition-based Shift/Reduce dependency parsers often make mistakes in the analysis of long span dependencies (McDonald & Nivre, 2007).
- research-article, May 2009
Recognising the predicate-argument structure of Tagalog
This paper describes research on parsing Tagalog text for predicate-argument structure (PAS). We first outline the linguistic phenomenon and corpus annotation process, then detail a series of PAS parsing experiments.
- research-article, May 2009
Combining constituent parsers
Combining the 1-best output of multiple parsers via parse selection or parse hybridization improves f-score over the best individual parser (Henderson and Brill, 1999; Sagae and Lavie, 2006). We propose three ways to improve upon existing methods for ...
- research-article, May 2009
Active Zipfian sampling for statistical parser training
Active learning has proven to be a successful strategy for the quick development of corpora used to train statistical natural language parsers. The vast majority of studies in this field have focused on estimating the informativeness of samples; however,...
- research-article, May 2009
Quadratic features and deep architectures for chunking
We experiment with several chunking models. Deeper architectures achieve better generalization. Quadratic filters, a simplification of a theoretical model of V1 complex cells, reliably increase accuracy. In fact, logistic regression with quadratic ...
- research-article, May 2009
Sentence boundary detection and the problem with the U.S.
Sentence Boundary Detection is widely used but often with outdated tools. We discuss what makes it difficult, which features are relevant, and present a fully statistical system, now publicly available, that gives the best known error rate on a standard ...
- research-article, May 2009
Semantic classification with WordNet kernels
This paper presents methods for performing graph-based semantic classification using kernel functions defined on the WordNet lexical hierarchy. These functions are evaluated on the SemEval Task 4 relation classification dataset and their performance is ...
- research-article, May 2009
Estimating and exploiting the entropy of sense distributions
Word sense distributions are usually skewed. Predicting the extent of the skew can help a word sense disambiguation (WSD) system determine whether to consider evidence from the local context or apply the simple yet effective heuristic of using the first ...
- research-article, May 2009
Determining the position of adverbial phrases in English
In this paper we compare three approaches to adverbial positioning using lexical, syntactic, semantic and sentence-level features. We find that: (a) one- and two-stage classification-based approaches can achieve almost 86% accuracy in determining the ...
- research-article, May 2009
Tree linearization in English: improving language model based approaches
We compare two approaches to dependency tree linearization, a task which arises in many NLP applications. The first one is the widely used 'overgenerate and rank' approach which relies exclusively on a trigram language model (LM); the second one ...
- research-article, May 2009
On the importance of pivot language selection for statistical machine translation
Recent research on multilingual statistical machine translation focuses on the usage of pivot languages in order to overcome resource limitations for certain language pairs. Due to the richness of available language resources, English is in general the ...
- research-article, May 2009
Statistical post-editing of a rule-based machine translation system
Automatic post-editing (APE) systems aim at correcting the output of machine translation systems to produce better quality translations, i.e. to produce translations that can be manually post-edited with an increase in productivity. In this work, we present an ...
- research-article, May 2009
Improving a simple bigram HMM part-of-speech tagger by latent annotation and self-training
In this paper, we describe and evaluate a bigram part-of-speech (POS) tagger that uses latent annotations and then investigate using additional genre-matched unlabeled data for self-training the tagger. The use of latent annotations substantially ...
- research-article, May 2009
Language specific issue and feature exploration in Chinese event extraction
In this paper, we present a Chinese event extraction system. We point out a language specific issue in Chinese trigger labeling, and then commit to discussing the contributions of lexical, syntactic and semantic features applied in trigger labeling and ...
- research-article, May 2009
Using N-gram based features for machine translation system combination
Conventional confusion network based system combination for machine translation (MT) heavily relies on features that are based on the measure of agreement of words in different translation hypotheses. This paper presents two new features that consider ...