DOI: 10.3115/1119176.1119191

Using LSA and noun coordination information to improve the precision and recall of automatic hyponymy extraction

Published: 31 May 2003

Abstract

In this paper we demonstrate methods of improving both the recall and the precision of automatic methods for extraction of hyponymy (IS_A) relations from free text. By applying latent semantic analysis (LSA) to filter extracted hyponymy relations we reduce the rate of error of our initial pattern-based hyponymy extraction by 30%, achieving precision of 58%. Applying a graph-based model of noun-noun similarity learned automatically from coordination patterns to previously extracted correct hyponymy relations, we achieve roughly a five-fold increase in the number of correct hyponymy relations extracted.
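
The three stages named in the abstract (pattern-based extraction, similarity-based filtering, and expansion through noun coordination) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration only: the single "Xs such as Y" pattern, the hand-made two-dimensional vectors standing in for LSA vectors, the 0.7 threshold, and the function names are hypothetical and are not taken from the paper.

```python
import math
import re

# Hypothetical single pattern: "<hypernym>s such as <hyponym>".
PATTERN = re.compile(r"(\w+)s such as (\w+)")


def extract_candidates(text):
    """Return (hyponym, hypernym) candidate pairs matched by the pattern."""
    return [(m.group(2), m.group(1)) for m in PATTERN.finditer(text)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def filter_by_similarity(pairs, vectors, threshold):
    """Keep pairs whose two terms are similar enough in the vector space."""
    return [
        (hypo, hyper)
        for hypo, hyper in pairs
        if hypo in vectors
        and hyper in vectors
        and cosine(vectors[hypo], vectors[hyper]) >= threshold
    ]


def expand_by_coordination(pairs, coordinations):
    """Propose the same hypernym for nouns coordinated with a known hyponym."""
    neighbours = {}
    for a, b in coordinations:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    expanded = set(pairs)
    for hypo, hyper in pairs:
        for noun in neighbours.get(hypo, ()):
            if noun != hyper:
                expanded.add((noun, hyper))
    return expanded


# Toy data: the vectors are invented stand-ins for LSA vectors.
text = "They sell fruits such as apples. He bought tools such as bananas."
toy_vectors = {
    "fruit": [1.0, 0.1],
    "apples": [0.9, 0.2],
    "tool": [0.1, 1.0],
    "bananas": [0.8, 0.3],
}
candidates = extract_candidates(text)
kept = filter_by_similarity(candidates, toy_vectors, threshold=0.7)
expanded = expand_by_coordination(kept, [("apples", "pears")])
print(candidates)
print(kept)
print(sorted(expanded))
```

In this toy run the filter discards the spurious pair ("bananas", "tool") because its vectors point in different directions, and the coordination "apples and pears" lets "pears" inherit the hypernym "fruit". Filtering targets precision and expansion targets recall, which mirrors the division of labour the abstract describes.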


Published In

CONLL '03: Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
May 2003
213 pages

Publisher

Association for Computational Linguistics

United States

Cited By

  • (2017) Automatic Knowledge Base Construction from Scholarly Documents. Proceedings of the 2017 ACM Symposium on Document Engineering, pages 149-152. DOI: 10.1145/3103010.3121043. Online publication date: 31-Aug-2017.
  • (2016) Semantic Querying based Concept Hierarchy Construction for Ontology Learning. Proceedings of the International Conference on Informatics and Analytics, pages 1-6. DOI: 10.1145/2980258.2980307. Online publication date: 25-Aug-2016.
  • (2015) Unsupervised learning of an IS-A taxonomy from a limited domain-specific corpus. Proceedings of the 24th International Conference on Artificial Intelligence, pages 1434-1441. DOI: 10.5555/2832415.2832449. Online publication date: 25-Jul-2015.
  • (2013) Automatic Information Extraction from Texts with Inference and Linguistic Knowledge Acquisition Rules. Proceedings of the 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) - Volume 03, pages 151-154. DOI: 10.1109/WI-IAT.2013.171. Online publication date: 17-Nov-2013.
  • (2012) A study of hybrid similarity measures for semantic relation extraction. Proceedings of the Workshop on Innovative Hybrid Approaches to the Processing of Textual Data, pages 10-18. DOI: 10.5555/2388632.2388634. Online publication date: 23-Apr-2012.
  • (2012) An evaluation of corpus-driven measures of medical concept similarity for information retrieval. Proceedings of the 21st ACM international conference on Information and knowledge management, pages 2439-2442. DOI: 10.1145/2396761.2398661. Online publication date: 29-Oct-2012.
  • (2012) Corpus-Driven hyponym acquisition for Turkish language. Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I, pages 29-41. DOI: 10.1007/978-3-642-28604-9_3. Online publication date: 11-Mar-2012.
  • (2011) Using the web to validate lexico-semantic relations. Proceedings of the 15th Portuguese conference on Progress in artificial intelligence, pages 597-609. DOI: 10.5555/2051115.2051170. Online publication date: 10-Oct-2011.
  • (2011) Ontology development for health care in India. Proceedings of the International Conference & Workshop on Emerging Trends in Technology, pages 715-718. DOI: 10.1145/1980022.1980176. Online publication date: 25-Feb-2011.
  • (2010) Web mining for event-based commonsense knowledge using lexico-syntactic pattern matching and semantic role labeling. Expert Systems with Applications: An International Journal, 37:1, pages 341-347. DOI: 10.1016/j.eswa.2009.05.060. Online publication date: 1-Jan-2010.
