DOI: 10.3115/1220175.1220190

Espresso: leveraging generic patterns for automatically harvesting semantic relations

Published: 17 July 2006

Abstract

In this paper, we present Espresso, a weakly-supervised, general-purpose, and accurate algorithm for harvesting semantic relations. The main contributions are: i) a method for exploiting generic patterns by filtering incorrect instances using the Web; and ii) a principled measure of pattern and instance reliability that enables the filtering algorithm. We present an empirical comparison of Espresso with various state-of-the-art systems, on corpora of different sizes and genres, extracting various general and specific relations. Experimental results show that our exploitation of generic patterns substantially increases system recall with little effect on overall precision.
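
The abstract does not define the reliability measure, so the following is only a minimal sketch of one way such a measure could work, not the paper's stated formulation. It assumes that a pattern is scored by averaging, over a set of seed instances, each instance's reliability weighted by its normalized pointwise mutual information (PMI) with the pattern. The function name, the PMI table, the patterns, and all numeric values are hypothetical.

```python
def pattern_reliability(pattern, instance_reliability, pmi_table):
    """Average PMI-weighted reliability of the instances seen with `pattern`.

    Assumption for illustration: PMI values are precomputed and are
    normalized by the largest absolute PMI in the table.
    """
    max_pmi = max(abs(v) for v in pmi_table.values())
    total = 0.0
    for instance, reliability in instance_reliability.items():
        # Instance/pattern pairs never observed together contribute nothing.
        association = pmi_table.get((instance, pattern), 0.0) / max_pmi
        total += association * reliability
    return total / len(instance_reliability)


# Hypothetical PMI values between (instance, pattern) pairs.
pmi_table = {
    (("Italy", "country"), "Y such as X"): 2.0,
    (("Canada", "country"), "Y such as X"): 1.0,
    (("Italy", "country"), "X of Y"): 0.5,
}

# Seed instances are assumed to be fully reliable.
seeds = {("Italy", "country"): 1.0, ("Canada", "country"): 1.0}

scores = {
    pattern: pattern_reliability(pattern, seeds, pmi_table)
    for pattern in ("Y such as X", "X of Y")
}
print(scores)
```

In this toy setup the generic pattern "X of Y" scores far lower than the specific one, which is the kind of signal a filtering step could use before accepting the instances that a generic pattern extracts.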

Published In

ACL-44: Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
July 2006
1214 pages

Publisher

Association for Computational Linguistics
United States

Acceptance Rates

Overall Acceptance Rate 85 of 443 submissions, 19%
