research-article
DOI: 10.1145/2063518.2063519

DBpedia Spotlight: shedding light on the web of documents

Published: 07 September 2011

    Abstract

    Interlinking text documents with Linked Open Data enables the Web of Data to be used as background knowledge within document-oriented applications such as search and faceted browsing. As a step towards interconnecting the Web of Documents with the Web of Data, we developed DBpedia Spotlight, a system for automatically annotating text documents with DBpedia URIs. DBpedia Spotlight allows users to configure the annotations to their specific needs through the DBpedia Ontology and quality measures such as prominence, topical pertinence, contextual ambiguity and disambiguation confidence. We compare our approach with the state of the art in disambiguation, and evaluate our results in light of three baselines and six publicly available annotation systems, demonstrating the competitiveness of our system. DBpedia Spotlight is shared as open source and deployed as a Web Service freely available for public use.

    Published In

    I-Semantics '11: Proceedings of the 7th International Conference on Semantic Systems
    September 2011
    129 pages
    ISBN: 9781450306218
    DOI: 10.1145/2063518
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. DBpedia
    2. linked data
    3. named entity disambiguation
    4. text annotation

    Qualifiers

    • Research-article

    Conference

    I-Semantics '11

    Acceptance Rates

    Overall Acceptance Rate 40 of 182 submissions, 22%

    Article Metrics

    • Downloads (last 12 months): 118
    • Downloads (last 6 weeks): 13
    Reflects downloads up to 12 Aug 2024

    Cited By

    • (2024) The RDF2vec family of knowledge graph embedding methods. Semantic Web (1-32). DOI: 10.3233/SW-233514. Online publication date: 25-Jan-2024
    • (2024) ADOxx: Eine Low-Code-Plattform für die Entwicklung von Modellierungswerkzeugen [ADOxx: A Low-Code Platform for the Development of Modeling Tools]. HMD Praxis der Wirtschaftsinformatik. DOI: 10.1365/s40702-024-01096-x. Online publication date: 2-Aug-2024
    • (2024) TelarKG: a Knowledge Graph of Chile's Constitutional Process. Proceedings of the 7th Joint Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA) (1-5). DOI: 10.1145/3661304.3661899. Online publication date: 14-Jun-2024
    • (2024) On the Opportunities and Challenges of Foundation Models for GeoAI (Vision Paper). ACM Transactions on Spatial Algorithms and Systems 10:2 (1-46). DOI: 10.1145/3653070. Online publication date: 1-Jul-2024
    • (2024) Learner Modeling and Recommendation of Learning Resources using Personal Knowledge Graphs. Proceedings of the 14th Learning Analytics and Knowledge Conference (273-283). DOI: 10.1145/3636555.3636881. Online publication date: 18-Mar-2024
    • (2024) Transparent Learner Knowledge State Modeling using Personal Knowledge Graphs and Graph Neural Networks. Adjunct Proceedings of the 32nd ACM Conference on User Modeling, Adaptation and Personalization (591-596). DOI: 10.1145/3631700.3665230. Online publication date: 27-Jun-2024
    • (2024) PKG API: A Tool for Personal Knowledge Graph Management. Companion Proceedings of the ACM on Web Conference 2024 (1051-1054). DOI: 10.1145/3589335.3651247. Online publication date: 13-May-2024
    • (2024) Doc-KG: Unstructured documents to knowledge graph construction, identification and validation with Wikidata. Expert Systems. DOI: 10.1111/exsy.13617. Online publication date: 8-May-2024
    • (2024) Unleashing Competitive Intelligence: News Mining Analysis on Technology Trends and Digital Health Driving Healthcare Innovation. IEEE Transactions on Engineering Management 71 (12311-12325). DOI: 10.1109/TEM.2023.3326233. Online publication date: 2024
    • (2024) Understanding the impact of geotagging on location inference models for accurate generalization to non-geotagged datasets. Geomatica 76:1 (100004). DOI: 10.1016/j.geomat.2024.100004. Online publication date: Jul-2024
