DOI: 10.1145/3341105.3375753
research-article

An analysis of mapping strategies for storing RDF data into NoSQL databases

Published: 30 March 2020

Abstract

RDF and SPARQL are increasingly considered in a broad range of information management scenarios. Governments, large corporations, startups, open data initiatives, and other organizations use RDF as a data model for sharing information and knowledge across several domains. In such scenarios, scalability is the main issue for virtually all recently proposed triplestores, so many of them employ NoSQL databases in their storage layers. In this paper we present a survey of mapping strategies for storing RDF data in NoSQL databases. Most existing approaches consider query response as the main measure of quality, so the impact of the mapping design is hidden behind optimization, partitioning, and indexing issues. The main contribution of this survey is a discussion of the benefits and disadvantages of each strategy.




Published In

SAC '20: Proceedings of the 35th Annual ACM Symposium on Applied Computing
March 2020
2348 pages
ISBN: 9781450368667
DOI: 10.1145/3341105

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

1. NoSQL
2. RDF
3. RDF partitioning
4. RDF to NoSQL mapping

Conference

SAC '20: The 35th ACM/SIGAPP Symposium on Applied Computing
March 30 - April 3, 2020
Brno, Czech Republic

Acceptance Rates

Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%
