Automatic construction of a hypernym-labeled noun hierarchy from text

January 2001

Author: Sharon Ann Caraballo
Adviser: Eugene Charniak

Publisher: Brown University, Department of Computer Science, Box 1910, Providence, RI, United States

ISBN: 978-0-493-15995-9

Order Number: AAI3006696

Abstract

Many language processing tasks are dependent on large databases of lexical semantic information, such as WordNet. These hand-built resources are tremendously time-consuming to create and may be lacking in coverage. They may be particularly inappropriate for text from a single domain, both because domain-specific terms are missing and because the lexicon contains many words or meanings which would be extremely rare in that domain. This thesis describes statistical techniques to automatically extract semantic information about words from text; specifically, given a large corpus of text and no additional sources of semantic information, we build a hierarchy of nouns appearing in the text. The hierarchy is in the form of an IS-A tree, where the nodes of the tree contain one or more nouns, and the ancestors of a node contain hypernyms of the nouns in that node. (An English word A is said to be a hypernym of a word B if native speakers of English accept the sentence “B is a (kind of) A.”) The techniques presented here could be used in the construction of updated or domain-specific semantic resources as needed. The methods described here provide a substantial improvement over previously published results; while we could previously produce a hierarchy whose internal nodes were judged to be correct hypernyms for 33% of the nouns beneath them, we can now achieve 56% on this measure. The thesis also includes a detailed discussion of a particular subproblem: determining which of a pair of nouns is more specific. We identify numerical measures which can be easily computed from a text corpus and which can answer this question with over 80% accuracy.
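The IS-A tree described in the abstract can be pictured with a small data-structure sketch. The following is only an illustration, not the thesis's implementation: the names `Node`, `find`, and `hypernyms` are hypothetical, and the toy nouns are invented rather than taken from the author's corpus. It shows nodes that hold one or more nouns, and a lookup that reads a noun's hypernyms off the ancestors of its node.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of an IS-A tree: a set of nouns plus child nodes (hypothetical layout)."""
    nouns: set[str]
    children: list[Node] = field(default_factory=list)
    parent: Node | None = field(default=None, repr=False)

    def add_child(self, child: Node) -> Node:
        """Attach `child` beneath this node and return it."""
        child.parent = self
        self.children.append(child)
        return child


def find(root: Node, noun: str) -> Node | None:
    """Depth-first search for the node whose noun set contains `noun`."""
    if noun in root.nouns:
        return root
    for child in root.children:
        hit = find(child, noun)
        if hit is not None:
            return hit
    return None


def hypernyms(root: Node, noun: str) -> list[str]:
    """Return the nouns stored in the ancestors of `noun`'s node, nearest ancestor first."""
    node = find(root, noun)
    result: list[str] = []
    while node is not None and node.parent is not None:
        node = node.parent
        result.extend(sorted(node.nouns))
    return result


# Toy data, invented for illustration only.
root = Node({"entity"})
animal = root.add_child(Node({"animal"}))
animal.add_child(Node({"dog", "hound"}))
animal.add_child(Node({"cat"}))

print(hypernyms(root, "dog"))
```

In this sketch a noun that is missing from the tree simply yields an empty list. How the tree itself is induced from a corpus is the subject of the thesis and is not modeled here.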

Recommendations

Automatic construction of a hypernym-labeled noun hierarchy from text
ACL '99: Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics

Previous work has shown that automatic methods can be used in building semantic lexicons. This work goes a step further by automatically creating not just clusters of related words, but a hierarchy of nouns and their hypernyms, akin to the hand-built ...
The Automatic Construction Method of Mongolian WordNet Noun Sets of Synonyms
ICINIS '11: Proceedings of the 2011 4th International Conference on Intelligent Networks and Intelligent Systems

Automatic construction of Mongolian noun sets of synonyms is the fundamental work to be accomplished first when developing the noun subnet of Mongolian WordNet. This article proposed an approach of transforming Chinese or English WordNet to Mongolian ...
Automatic Persian WordNet construction
COLING '10: Proceedings of the 23rd International Conference on Computational Linguistics: Posters

In this paper, an automatic method for Persian WordNet construction based on Princeton WordNet 2.1 (PWN) is introduced. The proposed approach uses Persian and English corpora as well as a bilingual dictionary in order to make a mapping between PWN ...
