Article
DOI: 10.1007/978-3-030-54956-5_1

Requirements Analysis for an Open Research Knowledge Graph

Published: 25 August 2020

Abstract

Current science communication has a number of drawbacks and bottlenecks that have recently been the subject of discussion. Among others, the rising number of published articles makes it nearly impossible to get a full overview of the state of the art in a given field, and reproducibility is hampered by fixed-length, document-based publications, which normally cannot cover all details of a research work. Recently, several initiatives have proposed knowledge graphs (KGs) for organising scientific information as a solution to many of these issues. The focus of these proposals is, however, usually restricted to very specific use cases. In this paper, we aim to transcend this limited perspective by presenting a comprehensive analysis of requirements for an Open Research Knowledge Graph (ORKG) by (a) collecting the daily core tasks of a scientist, (b) establishing their consequential requirements for a KG-based system, and (c) identifying overlaps and specificities, and their coverage in current solutions. As a result, we map necessary and desirable requirements for successful KG-based science communication, derive implications, and outline possible solutions.

Published In

Digital Libraries for Open Knowledge: 24th International Conference on Theory and Practice of Digital Libraries, TPDL 2020, Lyon, France, August 25–27, 2020, Proceedings
Aug 2020
234 pages
ISBN: 978-3-030-54955-8
DOI: 10.1007/978-3-030-54956-5
Editors: Mark Hall, Tanja Merčun, Thomas Risse, Fabien Duchateau

Publisher

Springer-Verlag, Berlin, Heidelberg

Author Tags

1. Scholarly communication
2. Research Knowledge Graph
3. Design science research
4. Requirements analysis
