Automatic detection of prosodic constituents for parsing

January 1992

Author:
Colin Wills Wightman
Boston Univ., Boston, MA

Publisher:

Boston University
Office of Information Technology 111 Cummington St. Boston, MA
United States

Order Number: UMI Order No. GAX92-00939

Abstract

This dissertation describes research directed towards increasing the accuracy and processing speed of spoken language systems by developing methods to utilize prosody. Specifically, algorithms to automatically label prosodic phrasal structure are developed and a method of using this information to reduce syntactic ambiguity is investigated. The prosodic phrase structure is represented by a set of seven prosodic "break indices" motivated by linguistic theory. Three principal results are discussed: a quantitative examination of segmental lengthening near prosodic boundaries, an automatic algorithm for labeling prosodic boundaries, and a parse scoring mechanism.

A measure of segmental lengthening is developed and applied to speech with hand-labeled break indices to study the relationship between lengthening and perceived boundary size. Lengthening near phrasal boundaries is found to be restricted to the rhyme of the final syllable before the boundary. Furthermore, at least four levels of phrasal structure can be differentiated on the basis of this lengthening.

To detect the phrasal boundaries, a speech recognizer and the sentence transcription are used to obtain a phoneme segmentation, and a vector of features is generated for each word boundary. These features are motivated both by the lengthening results and by linguistic theory. The vectors are quantized via a binary tree, and a Hidden Markov Model is used to recover the sequence of boundaries (break indices) most likely to have produced the observed sequence of feature vectors. Break indices generated by this algorithm are highly correlated with those assigned by trained human listeners.

A method is developed by which the labels can be used in a speech understanding system to help identify the speaker's intended meaning. The approach taken here is to score a parse using analysis-by-synthesis. Based on a corpus of speech read by professional radio announcers, experimental results indicate that this method can achieve performance comparable to that of human listeners.

By integrating statistical methods and linguistic theory, an algorithm has been developed which can reduce the syntactic ambiguity encountered in spoken language systems. In addition, automatic labeling provides a powerful new tool for the study of prosody.

Cited By

Xie H, Andreae P, Zhang M and Warren P. Detecting stress in spoken English using Decision Trees and Support Vector Machines. Proceedings of the second workshop on Australasian information security, Data Mining and Web Intelligence, and Software Internationalisation - Volume 32, (145-150)

Contributors

Colin Wills Wightman
Boston University
- Publication years: 1989 - 1992
- Publication count: 6
- Citation count: 15
- Available for download: 1
- Downloads (cumulative): 283
- Downloads (12 months): 16
- Downloads (6 weeks): 4
- Average downloads per article: 283
- Average citations per article: 3

Index Terms

Automatic detection of prosodic constituents for parsing
1. Applied computing
  1. Arts and humanities
    1. Language translation
2. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing

Recommendations

Automatic Prosodic Event Detection Using Acoustic, Lexical, and Syntactic Evidence

With the advent of prosody annotation standards such as tones and break indices (ToBI), speech technologists and linguists alike have been interested in automatically detecting prosodic events in speech. This is because the prosodic tier provides an ...

Prosodic disambiguation in automatic speech understanding of Thai

Automatic Detection of the Prosodic Structures of Speech Utterances
SPECOM 2013: Proceedings of the 15th International Conference on Speech and Computer - Volume 8113

This paper presents an automatic approach for the detection of the prosodic structures of speech utterances. The algorithm relies on a hierarchical representation of the prosodic organization of the speech utterances. The approach is applied on a corpus ...
