Parallelizing join computations of SPARQL queries for large semantic web databases

Authors:

Jinghua Groppe,

Sven Groppe

SAC '11: Proceedings of the 2011 ACM Symposium on Applied Computing

Pages 1681 - 1686

https://doi.org/10.1145/1982185.1982536

Published: 21 March 2011

Abstract

While a number of optimization techniques have been developed to efficiently process increasingly large Semantic Web databases, these approaches have not fully leveraged the computational power of modern computers. Today's multi-core computers promise an enormous performance boost by providing a parallel computing platform. Although parallel relational database systems are well established, parallel query processing in Semantic Web databases has not been studied extensively. In this work, we develop parallel algorithms for the join computations of SPARQL queries. Our performance study shows that the parallel computation of SPARQL queries significantly speeds up querying large Semantic Web databases.

Index Terms

Parallelizing join computations of SPARQL queries for large semantic web databases
1. Information systems
  1. Data management systems
    1. Database management system engines
      1. Parallel and distributed DBMSs

Published In

SAC '11: Proceedings of the 2011 ACM Symposium on Applied Computing

March 2011

1868 pages

ISBN:9781450301138

DOI:10.1145/1982185

Conference Chairs:
William Chu, Tunghai University, TaiChung, Taiwan
W. Eric Wong, University of Texas at Dallas, Richardson, Texas

Program Chairs:
Mathew J. Palakal, Indiana University Purdue University, Indianapolis
Chih-Cheng Hung, Southern Polytechnic State University, Marietta

Copyright © 2011 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGAPP: ACM Special Interest Group on Applied Computing

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SAC'11

Sponsor:

SIGAPP

SAC'11: The 2011 ACM Symposium on Applied Computing

March 21 - 24, 2011

TaiChung, Taiwan

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

Article Metrics

Total Citations: 19
Total Downloads: 357
Downloads (Last 12 months): 2
Downloads (Last 6 weeks): 0

Reflects downloads up to 23 Dec 2024

Cited By

Warnke B, Fischer S, Groppe S (2023). Using Machine Learning and Routing Protocols for Optimizing Distributed SPARQL Queries in Collaboration. Computers, 12:10 (210). DOI: 10.3390/computers12100210. Online publication date: 17-Oct-2023.
https://doi.org/10.3390/computers12100210
Warnke B, Fischer S, Groppe S (2023). Distributed SPARQL queries in collaboration with the routing protocol. Proceedings of the 27th International Database Engineered Applications Symposium (99-106). DOI: 10.1145/3589462.3589497. Online publication date: 5-May-2023.
https://dl.acm.org/doi/10.1145/3589462.3589497
Bakken M, Soylu A (2023). Chrontext: Portable SPARQL queries over contextualised time series data in industrial settings. Expert Systems with Applications, 226 (120149). DOI: 10.1016/j.eswa.2023.120149. Online publication date: Sep-2023.
https://doi.org/10.1016/j.eswa.2023.120149
Bilidas D, Koubarakis M (2020). In-memory parallelization of join queries over large ontological hierarchies. Distributed and Parallel Databases. DOI: 10.1007/s10619-020-07305-y. Online publication date: 29-Jun-2020.
https://doi.org/10.1007/s10619-020-07305-y
Herrera J, Hogan A, Käfer T (2019). BTC-2019: The 2019 Billion Triple Challenge Dataset. The Semantic Web – ISWC 2019 (163-180). DOI: 10.1007/978-3-030-30796-7_11. Online publication date: 17-Oct-2019.
https://doi.org/10.1007/978-3-030-30796-7_11
Li R, Yang D, Hu H, Xie J, Fu L (2018). Scalable RDF graph querying using cloud computing. Journal of Web Engineering, 12:1-2 (159-180). DOI: 10.5555/2481562.2481568. Online publication date: 21-Dec-2018.
https://dl.acm.org/doi/10.5555/2481562.2481568
Chantrapornchai C, Makpaisit P (2018). TripleID-C. Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region (261-270). DOI: 10.1145/3149457.3155322. Online publication date: 28-Jan-2018.
https://dl.acm.org/doi/10.1145/3149457.3155322
Chantrapornchai C, Choksuchat C (2018). TripleID-Q: RDF Query Processing Framework Using GPU. IEEE Transactions on Parallel and Distributed Systems, 29:9 (2121-2135). DOI: 10.1109/TPDS.2018.2814567. Online publication date: 1-Sep-2018.
https://doi.org/10.1109/TPDS.2018.2814567
Choksuchat C, Chantrapornchai C (2018). Practical parallel string matching framework for RDF entailments with GPUs. Information Systems Frontiers, 20:4 (863-882). DOI: 10.1007/s10796-016-9692-4. Online publication date: 24-Dec-2018.
https://dl.acm.org/doi/10.1007/s10796-016-9692-4
Chantrapornchai C, Choksuchat C, Haidl M, Gorlatch S (2016). TripleID: A Low-Overhead Representation and Querying Using GPU for Large RDFs. Beyond Databases, Architectures and Structures. Advanced Technologies for Data Mining and Knowledge Discovery (400-415). DOI: 10.1007/978-3-319-34099-9_31. Online publication date: 28-Apr-2016.
https://doi.org/10.1007/978-3-319-34099-9_31