Trainable English generation with modifier and adjunct ordering

January 2009

Author:
Huayan Zhong
State University of New York at Stony Brook

Adviser:
Amanda Stent
State University of New York at Stony Brook

Publisher:

State University of New York at Stony Brook
Stony Brook, NY
United States

ISBN: 978-1-109-73874-2

Order Number: AAI3406711

Pages: 212


Abstract

Natural language generation (NLG) involves the automatic formulation of natural language sentences. The ultimate goal in NLG is for the computer to produce language that is as meaningful, fluent and idiomatic as that produced by humans. A typical NLG system will include components for selecting and structuring the content to be generated (content planning), assigning content units to sentences (sentence planning) and realizing each content unit in a particular language (surface realization). Although there are surface realizers that can produce fluent output, very little research has been done on adjunct ordering. Adjuncts are either ordered in the surface realizer's input, or all possibilities are generated and the alternatives are ranked using a language model.
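The generate-and-rank strategy mentioned at the end of this paragraph can be illustrated with a minimal sketch. It is not the thesis's implementation: the add-one-smoothed bigram model, the toy corpus, and the names `train_bigram_model`, `log_prob` and `rank_adjunct_orders` are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import permutations


def train_bigram_model(sentences):
    """Collect the counts needed for an add-one-smoothed bigram model."""
    history_counts, bigram_counts, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + list(sentence) + ["</s>"]
        vocab.update(tokens)
        history_counts.update(tokens[:-1])  # every token that has a successor
        bigram_counts.update(zip(tokens, tokens[1:]))
    return history_counts, bigram_counts, len(vocab)


def log_prob(tokens, model):
    """Log probability of a token sequence under the smoothed bigram model."""
    history_counts, bigram_counts, vocab_size = model
    padded = ["<s>"] + list(tokens) + ["</s>"]
    return sum(
        math.log((bigram_counts[(prev, cur)] + 1) / (history_counts[prev] + vocab_size))
        for prev, cur in zip(padded, padded[1:])
    )


def rank_adjunct_orders(core, adjuncts, model):
    """Generate every ordering of the adjuncts after the core and rank them."""
    candidates = []
    for order in permutations(adjuncts):
        tokens = list(core) + [word for adjunct in order for word in adjunct]
        candidates.append((log_prob(tokens, model), tokens))
    return sorted(candidates, key=lambda pair: pair[0], reverse=True)


# Hypothetical toy corpus, invented for this sketch.
corpus = [
    ["she", "left", "quietly", "at", "noon"],
    ["he", "arrived", "quietly", "at", "dawn"],
    ["they", "left", "quietly", "at", "noon"],
]
model = train_bigram_model(corpus)
ranked = rank_adjunct_orders(["she", "arrived"], [["at", "noon"], ["quietly"]], model)
```

Because every permutation is scored, the number of candidates grows factorially with the number of adjuncts, which is one practical reason to model ordering directly rather than overgenerate.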

In this thesis, I address modifier and adjunct ordering in trainable surface realization for English. First, I describe a chart-based surface realizer I have implemented that uses a probabilistic feature-based tree adjoining grammar extracted automatically from a training corpus. My surface realizer takes input logical forms and performs insertion of function words, word and constituent ordering, and morphological inflection. Its performance is comparable to that of other trainable surface realizers that take the same type of input.

Second, I present a set of experiments in which I compare different approaches to word and constituent ordering for several tasks: the dative and benefactive alternations, ordering of prenominal adjectives, ordering of adverbials, ordering of prepositional phrases, and ordering of adjuncts in general. I compare a classifier-based approach with rich feature sets to two simple relative frequency approaches (one using head words as the only feature, the other using part-of-speech tags as the only feature). I experimented with lexical, syntactic, semantic, pragmatic, sentence and frequency features in my classifier-based approach. I present evaluations of these approaches in terms of classification accuracy and improvement in performance of the surface realizer. I show that the classification-based approach to word and adjunct ordering improves on the simple relative-frequency-based approaches. I also show that different feature sets are useful for different tasks and genres, with (in general) pragmatic features being more important for dialog and syntactic features for text.
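As a rough illustration of what a relative frequency baseline of this kind might look like, the sketch below counts how often one modifier precedes another in training data and picks the ordering most consistent with those counts. The abstract does not specify the exact formulation, so the pairwise scoring, the toy data, and the names `count_precedence` and `order_modifiers` are assumptions. The `key` argument selects the single feature used: the head word or the part-of-speech tag.

```python
from collections import Counter
from itertools import permutations


def count_precedence(sequences, key):
    """Count how often feature value a appears before feature value b."""
    counts = Counter()
    for sequence in sequences:
        keys = [key(item) for item in sequence]
        for i in range(len(keys)):
            for j in range(i + 1, len(keys)):
                counts[(keys[i], keys[j])] += 1
    return counts


def order_modifiers(modifiers, counts, key):
    """Return the permutation that agrees best with the observed precedences."""
    def score(order):
        keys = [key(item) for item in order]
        return sum(
            counts[(keys[i], keys[j])] - counts[(keys[j], keys[i])]
            for i in range(len(keys))
            for j in range(i + 1, len(keys))
        )
    return list(max(permutations(modifiers), key=score))


def head_word(item):
    return item[0]  # items are (word, part-of-speech tag) pairs


def pos_tag(item):
    return item[1]


# Hypothetical training sequences of prenominal adjectives, invented for this sketch.
training = [
    [("big", "JJ"), ("red", "JJ")],
    [("big", "JJ"), ("old", "JJ")],
    [("old", "JJ"), ("red", "JJ")],
]
word_counts = count_precedence(training, head_word)
ordered = order_modifiers(
    [("red", "JJ"), ("old", "JJ"), ("big", "JJ")], word_counts, head_word
)
```

Passing `pos_tag` instead of `head_word` gives the coarser variant, which generalizes to unseen words but cannot separate modifiers that share a tag, as in this toy data where every item is tagged `JJ`.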

Finally, I describe a human evaluation of my initial surface realizer with relative frequency approaches to ordering of adjuncts and modifiers, and of my modified surface realizer which uses a classification-based approach to ordering of adjuncts and modifiers. I show that modeling of adjunct and modifier ordering can lead to small but significant improvements in surface realization performance, and analyze the types of errors identified by the human evaluators.

Contributors

Amanda Joy Stent
AT&T Laboratories Florham Park
Huayan Zhong
Stony Brook University

