DOI: 10.1145/1982185.1982536
research-article

Parallelizing join computations of SPARQL queries for large semantic web databases

Published: 21 March 2011 Publication History

Abstract

Although a number of optimization techniques have been developed to process increasingly large Semantic Web databases efficiently, these approaches have not fully leveraged the computational power of modern computers. Today's multi-core computers promise an enormous performance boost by providing a parallel computing platform. Although parallel relational database systems are well established, parallel query processing in Semantic Web databases has not been studied extensively. In this work, we develop parallel algorithms for join computations of SPARQL queries. Our performance study shows that the parallel computation of SPARQL queries significantly speeds up querying large Semantic Web databases.
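To illustrate what parallelizing a join can look like, here is a minimal sketch of a partitioned hash join over SPARQL solution mappings. It is not the algorithm developed in the paper, which the abstract does not specify; the function names (`hash_join`, `partition`, `parallel_join`), the dictionary representation of solution mappings, and the use of a thread pool are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def hash_join(left, right, join_vars):
    """Sequential hash join of two lists of solution mappings (dicts)."""
    table = {}
    for row in left:
        # Build phase: index the left input by its join-variable values.
        table.setdefault(tuple(row[v] for v in join_vars), []).append(row)
    out = []
    for row in right:
        # Probe phase: merge each right row with every matching left row.
        for match in table.get(tuple(row[v] for v in join_vars), []):
            out.append({**match, **row})
    return out


def partition(rows, join_vars, n):
    """Split rows into n buckets by hashing the join-variable values."""
    parts = [[] for _ in range(n)]
    for row in rows:
        key = tuple(row[v] for v in join_vars)
        parts[hash(key) % n].append(row)
    return parts


def parallel_join(left, right, join_vars, workers=4):
    """Join matching partitions independently, one task per partition."""
    lparts = partition(left, join_vars, workers)
    rparts = partition(right, join_vars, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = pool.map(hash_join, lparts, rparts, [join_vars] * workers)
        return [row for chunk in chunks for row in chunk]


# Hypothetical bindings for two triple patterns sharing the variable ?paper.
authors = [{"?paper": "p1", "?author": "a1"}, {"?paper": "p2", "?author": "a2"}]
titles = [{"?paper": "p1", "?title": "T1"}, {"?paper": "p3", "?title": "T3"}]
joined = parallel_join(authors, titles, ["?paper"])
```

Rows with equal join keys always land in the same partition, so each pair of partitions can be joined without coordinating with the others. In CPython, threads do not run CPU-bound code in parallel because of the global interpreter lock, so this sketch shows only the structure of the computation; a process pool or a runtime with real thread parallelism would be needed to observe a speedup.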

    Published In

    SAC '11: Proceedings of the 2011 ACM Symposium on Applied Computing
    March 2011
    1868 pages
    ISBN:9781450301138
    DOI:10.1145/1982185
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. RDF
    2. SPARQL
    3. parallel database
    4. semantic web

    Qualifiers

    • Research-article

    Conference

    SAC'11
    SAC'11: The 2011 ACM Symposium on Applied Computing
    March 21 - 24, 2011
    TaiChung, Taiwan

    Acceptance Rates

    Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

    Cited By

    • (2023) Using Machine Learning and Routing Protocols for Optimizing Distributed SPARQL Queries in Collaboration. Computers 12:10 (210). DOI: 10.3390/computers12100210. Online publication date: 17-Oct-2023
    • (2023) Distributed SPARQL queries in collaboration with the routing protocol. Proceedings of the 27th International Database Engineered Applications Symposium (99-106). DOI: 10.1145/3589462.3589497. Online publication date: 5-May-2023
    • (2023) Chrontext: Portable SPARQL queries over contextualised time series data in industrial settings. Expert Systems with Applications 226 (120149). DOI: 10.1016/j.eswa.2023.120149. Online publication date: Sep-2023
    • (2020) In-memory parallelization of join queries over large ontological hierarchies. Distributed and Parallel Databases. DOI: 10.1007/s10619-020-07305-y. Online publication date: 29-Jun-2020
    • (2019) BTC-2019: The 2019 Billion Triple Challenge Dataset. The Semantic Web – ISWC 2019 (163-180). DOI: 10.1007/978-3-030-30796-7_11. Online publication date: 17-Oct-2019
    • (2018) Scalable RDF graph querying using cloud computing. Journal of Web Engineering 12:1-2 (159-180). DOI: 10.5555/2481562.2481568. Online publication date: 21-Dec-2018
    • (2018) TripleID-C. Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region (261-270). DOI: 10.1145/3149457.3155322. Online publication date: 28-Jan-2018
    • (2018) TripleID-Q: RDF Query Processing Framework Using GPU. IEEE Transactions on Parallel and Distributed Systems 29:9 (2121-2135). DOI: 10.1109/TPDS.2018.2814567. Online publication date: 1-Sep-2018
    • (2018) Practical parallel string matching framework for RDF entailments with GPUs. Information Systems Frontiers 20:4 (863-882). DOI: 10.1007/s10796-016-9692-4. Online publication date: 24-Dec-2018
    • (2016) TripleID: A Low-Overhead Representation and Querying Using GPU for Large RDFs. Beyond Databases, Architectures and Structures. Advanced Technologies for Data Mining and Knowledge Discovery (400-415). DOI: 10.1007/978-3-319-34099-9_31. Online publication date: 28-Apr-2016
