DOI: 10.1145/2740908.2742022
research-article

What's in this paper?: Combining Rhetorical Entities with Linked Open Data for Semantic Literature Querying

Published: 18 May 2015
  • Abstract

    Finding research literature relevant to the work at hand is one of the essential tasks that scientists face on a daily basis. Standard information retrieval techniques make it possible to quickly obtain a vast number of potentially relevant documents. Unfortunately, inspecting the search results manually then requires significant effort, whereas we would rather select relevant publications based on more fine-grained, semantically rich queries involving a publication's contributions, methods, or application domains. We argue that a novel combination of three distinct methods can significantly advance this vision: (i) Natural Language Processing (NLP) for Rhetorical Entity (RE) detection; (ii) Named Entity (NE) recognition based on the Linked Open Data (LOD) cloud; and (iii) automatic generation of RDF triples for both NEs and REs using semantic web ontologies to interconnect them. Combined in a single workflow, these techniques allow us to automatically construct a knowledge base that facilitates numerous advanced use cases for managing scientific documents.
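The idea of interconnecting REs and NEs as triples and then querying them can be illustrated with a minimal sketch. It is not the authors' implementation: the `http://example.org/` vocabulary, the predicate names (`hasRhetoricalEntity`, `mentions`), the helper functions, and the sample annotations are all assumptions made for illustration, and a plain Python list stands in for a real triple store. The annotations are taken as given, as if the NLP and entity-linking steps had already produced them.

```python
from typing import Iterable, List, NamedTuple, Optional, Tuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Illustrative vocabulary. Only RDF_TYPE is a real, standard URI;
# everything under example.org is a hypothetical placeholder.
EX = "http://example.org/vocab#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
CONTRIBUTION = EX + "Contribution"
METHOD = EX + "Method"
HAS_RE = EX + "hasRhetoricalEntity"
MENTIONS = EX + "mentions"

# (document URI, rhetorical entity URI, RE type, linked entity URIs)
Annotation = Tuple[str, str, str, List[str]]


def build_triples(annotations: Iterable[Annotation]) -> List[Triple]:
    """Turn already-detected REs and their linked NEs into triples."""
    store: List[Triple] = []
    for doc_uri, re_uri, re_type, entity_uris in annotations:
        store.append(Triple(doc_uri, HAS_RE, re_uri))
        store.append(Triple(re_uri, RDF_TYPE, re_type))
        for entity_uri in entity_uris:
            store.append(Triple(re_uri, MENTIONS, entity_uri))
    return store


def match(store: List[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Return the triples that agree with every non-None pattern position."""
    return [t for t in store
            if (s is None or t.subject == s)
            and (p is None or t.predicate == p)
            and (o is None or t.obj == o)]


def documents_with(store: List[Triple], re_type: str,
                   entity_uri: str) -> List[str]:
    """Find documents that have an RE of re_type mentioning entity_uri."""
    docs = set()
    for mention in match(store, p=MENTIONS, o=entity_uri):
        re_uri = mention.subject
        if match(store, s=re_uri, p=RDF_TYPE, o=re_type):
            docs.update(t.subject for t in match(store, p=HAS_RE, o=re_uri))
    return sorted(docs)


# Made-up sample data: two documents mention the same entity,
# but only the first does so inside a contribution.
LINKED_DATA = "http://dbpedia.org/resource/Linked_data"
ANNOTATIONS: List[Annotation] = [
    ("http://example.org/doc/1", "http://example.org/doc/1#re1",
     CONTRIBUTION, [LINKED_DATA]),
    ("http://example.org/doc/2", "http://example.org/doc/2#re1",
     METHOD, [LINKED_DATA]),
]
STORE = build_triples(ANNOTATIONS)
RESULT = documents_with(STORE, CONTRIBUTION, LINKED_DATA)
print(RESULT)
```

A keyword search for the entity would return both sample documents. The query above returns only the one whose contribution mentions it, which is the kind of fine-grained selection the abstract describes.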

    Information & Contributors

    Information

    Published In

    WWW '15 Companion: Proceedings of the 24th International Conference on World Wide Web
    May 2015
    1602 pages
    ISBN: 9781450334730
    DOI: 10.1145/2740908

    Sponsors

    • IW3C2: International World Wide Web Conference Committee

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. natural language processing
    2. semantic publishing
    3. semantic web

    Qualifiers

    • Research-article

    Conference

    WWW '15
    Sponsor:
    • IW3C2

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Cited By

    • (2022) Constructing a high-quality dataset for automated creation of summaries of fundamental contributions of research articles. Scientometrics 127:12 (7061-7075). DOI: 10.1007/s11192-022-04380-z. Online publication date: 28-Apr-2022
    • (2020) From Publications to Knowledge Graphs. Information Search, Integration, and Personalization (18-33). DOI: 10.1007/978-3-030-44900-1_2. Online publication date: 27-Mar-2020
    • (2018) TSE-NER: An Iterative Approach for Long-Tail Entity Extraction in Scientific Publications. The Semantic Web – ISWC 2018 (127-143). DOI: 10.1007/978-3-030-00671-6_8. Online publication date: 18-Sep-2018
    • (2018) Ontology Driven Extraction of Research Processes. The Semantic Web – ISWC 2018 (162-178). DOI: 10.1007/978-3-030-00671-6_10. Online publication date: 8-Oct-2018
    • (2017) Semantic Annotation of Data Processing Pipelines in Scientific Publications. The Semantic Web (321-336). DOI: 10.1007/978-3-319-58068-5_20. Online publication date: 16-May-2017
    • (2017) Knowledge Extraction and Modeling from Scientific Publications. Semantics, Analytics, Visualization. Enhancing Scholarly Data (11-25). DOI: 10.1007/978-3-319-53637-8_2. Online publication date: 10-May-2017
    • (2016) Combining NLP And Semantics For Mining Software Technologies From Research Publications. Proceedings of the 25th International Conference Companion on World Wide Web (23-24). DOI: 10.1145/2872518.2889358. Online publication date: 11-Apr-2016
    • (2016) Unsupervised Relation Extraction in Specialized Corpora Using Sequence Mining. Advances in Intelligent Data Analysis XV (237-248). DOI: 10.1007/978-3-319-46349-0_21. Online publication date: 21-Sep-2016
    • (2016) Automatic Construction of a Semantic Knowledge Base from CEUR Workshop Proceedings. Semantic Web Evaluation Challenges (129-141). DOI: 10.1007/978-3-319-25518-7_11. Online publication date: 7-Jan-2016
