DOI: 10.1145/1242572.1242667
Article

Yago: a core of semantic knowledge

Published: 08 May 2007

Abstract

    We present YAGO, a light-weight and extensible ontology with high coverage and quality. YAGO builds on entities and relations and currently contains more than 1 million entities and 5 million facts. This includes the Is-A hierarchy as well as non-taxonomic relations between entities (such as HASWONPRIZE). The facts have been automatically extracted from Wikipedia and unified with WordNet, using a carefully designed combination of rule-based and heuristic methods described in this paper. The resulting knowledge base is a major step beyond WordNet: in quality, by adding knowledge about individuals such as persons, organizations, and products together with their semantic relationships, and in quantity, by increasing the number of facts by more than an order of magnitude. Our empirical evaluation of fact correctness shows an accuracy of about 95%. YAGO is based on a logically clean model, which is decidable, extensible, and compatible with RDFS. Finally, we show how YAGO can be further extended by state-of-the-art information extraction techniques.
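
To make the entity-relation model in the abstract more concrete, here is a minimal Python sketch of facts stored as (subject, relation, object) triples, with class membership derived by following the Is-A hierarchy. It is an illustration under stated assumptions, not the paper's implementation. The relation names TYPE and SUBCLASSOF, the helper names `build_index` and `classes_of`, and the sample facts are all hypothetical choices made for this example.

```python
from collections import defaultdict

# Illustrative facts only. The relation names TYPE and SUBCLASSOF and the
# sample entities are assumptions made for this sketch.
FACTS = [
    ("Albert_Einstein", "TYPE", "physicist"),
    ("physicist", "SUBCLASSOF", "scientist"),
    ("scientist", "SUBCLASSOF", "person"),
    ("Albert_Einstein", "HASWONPRIZE", "Nobel_Prize_in_Physics"),
]


def build_index(facts):
    """Map each (subject, relation) pair to the set of its objects."""
    index = defaultdict(set)
    for subject, relation, obj in facts:
        index[(subject, relation)].add(obj)
    return index


def classes_of(entity, index):
    """Return every class of an entity, following SUBCLASSOF transitively."""
    seen = set()
    frontier = list(index.get((entity, "TYPE"), ()))
    while frontier:
        cls = frontier.pop()
        if cls not in seen:
            seen.add(cls)
            # Walk one step up the Is-A hierarchy.
            frontier.extend(index.get((cls, "SUBCLASSOF"), ()))
    return seen


INDEX = build_index(FACTS)
```

Calling `classes_of("Albert_Einstein", INDEX)` collects the taxonomic (Is-A) classes, while a non-taxonomic relation such as HASWONPRIZE is read directly with `INDEX[("Albert_Einstein", "HASWONPRIZE")]`.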

      Published In

      WWW '07: Proceedings of the 16th international conference on World Wide Web
      May 2007
      1382 pages
      ISBN: 9781595936547
      DOI: 10.1145/1242572
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. WordNet
      2. Wikipedia

      Qualifiers

      • Article

      Conference

      WWW'07: 16th International World Wide Web Conference
      May 8 - 12, 2007
      Banff, Alberta, Canada

      Acceptance Rates

      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
