
ROXXI: Reviving witness dOcuments to eXplore eXtracted Information

Published: 01 September 2010

Abstract

In recent years, there has been considerable research on information extraction and on constructing RDF knowledge bases. In general, the goal is to extract all relevant information from a corpus of documents, store it in an ontology, and answer future queries based only on the created knowledge base. Thus, the original documents become dispensable. On the one hand, an ontology is a convenient and non-redundant structured source of information, based on which specific queries can be answered efficiently. On the other hand, many users doubt the correctness of facts and ontology subgraphs presented to them as query results without proof. Instead, users often wish to verify the obtained facts or subgraphs by reading about them in context, i.e., in a document relating the facts and providing background information. In this demo, we present ROXXI, a system operating on top of an existing knowledge base and reviving the abandoned witness documents. In doing so, it goes in the opposite direction of information extraction approaches: it starts with ontological facts and traces them back to the documents they were extracted from. ROXXI offers interfaces for expert users (SPARQL) as well as for non-experts (ontology browser) and provides a ranked list of documents, each associated with a content snippet highlighting the queried facts in context. At the demonstration site, we will show the advantages of this novel approach to document retrieval and illustrate the benefits of reviving the documents that information extraction approaches neglect.
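The idea of tracing facts back to their witness documents can be illustrated with a small sketch. This is not ROXXI's implementation: the toy facts, the documents, and the function names `match`, `witness_documents`, and `snippet` are all hypothetical. Ranking by the number of witnessed facts is a deliberately simple stand-in for whatever ranking model the system actually uses, and a wildcard triple pattern stands in for a SPARQL query.

```python
from collections import defaultdict

# Hypothetical toy knowledge base: each extracted fact (subject, predicate,
# object) remembers the IDs of the documents it was extracted from.
FACTS = {
    ("Albert_Einstein", "bornIn", "Ulm"): ["doc1", "doc2"],
    ("Albert_Einstein", "wonPrize", "Nobel_Prize"): ["doc1"],
    ("Ulm", "locatedIn", "Germany"): ["doc3"],
}

# Hypothetical witness documents, keyed by document ID.
DOCS = {
    "doc1": "Albert Einstein was born in Ulm. He later won the Nobel Prize in Physics.",
    "doc2": "The city of Ulm is the birthplace of Albert Einstein.",
    "doc3": "Ulm is a city in Germany on the river Danube.",
}


def match(pattern):
    """Return all facts matching a triple pattern; None acts as a wildcard."""
    return [
        fact for fact in FACTS
        if all(p is None or p == f for p, f in zip(pattern, fact))
    ]


def witness_documents(facts):
    """Rank documents by how many of the given facts they witness."""
    support = defaultdict(set)
    for fact in facts:
        for doc_id in FACTS[fact]:
            support[doc_id].add(fact)
    # Most witnessed facts first; ties are broken by document ID.
    return sorted(support.items(), key=lambda kv: (-len(kv[1]), kv[0]))


def snippet(doc_id, facts):
    """Return the first sentence naming a fact's subject and object, both marked."""
    for sentence in DOCS[doc_id].split(". "):
        for subj, _, obj in sorted(facts):
            s, o = subj.replace("_", " "), obj.replace("_", " ")
            if s in sentence and o in sentence:
                return sentence.replace(s, f"[{s}]").replace(o, f"[{o}]")
    return None


if __name__ == "__main__":
    # Query: everything the knowledge base states about Albert Einstein.
    queried = match(("Albert_Einstein", None, None))
    for doc_id, witnessed in witness_documents(queried):
        print(doc_id, len(witnessed), snippet(doc_id, witnessed))
```

The sketch keeps provenance alongside each fact, which is the one property the approach relies on: without a stored link from a fact to its source documents, there is nothing to trace back.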



Published in: Proceedings of the VLDB Endowment, Volume 3, Issue 1-2, September 2010 (1658 pages)

Publisher: VLDB Endowment
