DOI: 10.1145/2938503.2938548
short-paper

JARS: Join-Aware Distributed RDF Storage

Published: 11 July 2016

Abstract

The enormous increase of data in RDF format calls for efficient storage and retrieval approaches. Because RDF data is highly connected, query processing generates massive amounts of intermediate results. Many current RDF storage approaches incur large amounts of inter-node data movement even for simple, selective query patterns. We propose JARS, a join-aware distributed RDF storage system that couples a dual-hash partitioning strategy with two-layered distributed clustered indexing and a rule-based query-execution approach. JARS eliminates inter-node communication for star patterns and mitigates the communication cost of chain-pattern SPARQL queries. Our experiments indicate that JARS achieves significant performance improvements over state-of-the-art RDF storage systems.
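
The abstract does not spell out how dual-hash partitioning works, so the following is only a minimal sketch of one plausible reading: each triple is placed twice, once by a hash of its subject and once by a hash of its object. The function names, the use of MD5 as the hash, the node count, and the sample triples are all assumptions made for illustration and are not taken from the paper.

```python
import hashlib
from collections import defaultdict


def node_for(term: str, num_nodes: int) -> int:
    """Map an RDF term to a node id with a stable hash (MD5 is an assumption)."""
    digest = hashlib.md5(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def dual_hash_partition(triples, num_nodes):
    """Place every triple twice: once by subject hash, once by object hash."""
    by_subject = defaultdict(list)  # node id -> triples whose subject hashes there
    by_object = defaultdict(list)   # node id -> triples whose object hashes there
    for s, p, o in triples:
        by_subject[node_for(s, num_nodes)].append((s, p, o))
        by_object[node_for(o, num_nodes)].append((s, p, o))
    return by_subject, by_object


# Hypothetical sample data, not from the paper.
triples = [
    ("alice", "worksFor", "acme"),
    ("alice", "knows", "bob"),
    ("bob", "worksFor", "acme"),
    ("acme", "locatedIn", "berlin"),
]
by_subject, by_object = dual_hash_partition(triples, num_nodes=4)

# Star pattern: all triples sharing the subject "alice" sit on one node.
star_node = node_for("alice", 4)
star = [t for t in by_subject[star_node] if t[0] == "alice"]

# Chain pattern ?x worksFor ?y . ?y locatedIn ?z: the object copy of the first
# triple and the subject copy of the second both hash on "acme".
chain_node = node_for("acme", 4)
left = [t for t in by_object[chain_node] if t[2] == "acme"]
right = [t for t in by_subject[chain_node] if t[0] == "acme"]
```

Under these assumptions, a subject-subject (star) join never leaves a node, and a single object-subject (chain) step finds both sides on the node that owns the shared term; longer chains would still need some data movement, which is consistent with the abstract saying the cost is mitigated rather than eliminated.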

    Published In

    IDEAS '16: Proceedings of the 20th International Database Engineering & Applications Symposium
    July 2016
    420 pages
    ISBN:9781450341189
    DOI:10.1145/2938503
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

    In-Cooperation

    • Keio University

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Clustered Index
    2. Dual-hash Distribution
    3. RDF
    4. SPARQL

    Qualifiers

    • Short-paper
    • Research
    • Refereed limited

    Conference

    IDEAS '16

    Acceptance Rates

    Overall Acceptance Rate 74 of 210 submissions, 35%

    Cited By

    • (2020) Storage, partitioning, indexing and retrieval in Big RDF frameworks: A survey. Computer Science Review, 38, article 100309. DOI: 10.1016/j.cosrev.2020.100309. Online publication date: Nov-2020
    • (2019) HyPSo. Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, pages 188-194. DOI: 10.1145/3297001.3297025. Online publication date: 3-Jan-2019
    • (2018) JOTR: Join-Optimistic Triple Reordering Approach for SPARQL Query Optimization on Big RDF Data. 2018 9th International Conference on Computing, Communication and Networking Technologies (ICCCNT), pages 1-7. DOI: 10.1109/ICCCNT.2018.8493743. Online publication date: Jul-2018
    • (2017) DWAHP. Proceedings of the 21st International Database Engineering & Applications Symposium, pages 235-241. DOI: 10.1145/3105831.3105864. Online publication date: 12-Jul-2017
