research-article

Congenial Benchmarking of RDF Storage Solutions

Authors:

Axel-Cyrille Ngonga Ngomo,

Maximilian Pensel,

Anni-Yasmin Turhan

K-CAP '19: Proceedings of the 10th International Conference on Knowledge Capture

Pages 213 - 221

https://doi.org/10.1145/3360901.3364429

Published: 23 September 2019

Abstract

Many SPARQL benchmark generation techniques rely on SPARQL query templates or select representative queries from a set of input queries by inspecting their syntactic features. Hence, the prototype queries in such benchmarks mainly capture combinations of SPARQL features, but neither the semantics of the queries nor the conceptual associations between them. We present congenial benchmarks, a novel type of benchmark that detects conceptual associations and can thus reflect prototypical user intentions when selecting prototype queries. We study SPARROW, an instantiation of congenial benchmarks in which the conceptual associations between SPARQL queries are measured by concept similarity measures. To this end, we transform unary acyclic conjunctive SPARQL queries into concepts of the description logic ELH. Our evaluation of three popular triple stores on two datasets shows that the benchmarks generated by SPARROW differ considerably from benchmarks generated using a feature-based approach. Moreover, our evaluation suggests that SPARROW can characterize the performance of common triple stores with respect to user needs by exploiting conceptual associations to detect prototypical user needs.
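The query-to-concept step mentioned in the abstract can be illustrated with a minimal sketch. It is not the SPARROW implementation: the triple representation, the names `Concept`, `to_concept`, `atoms`, and `jaccard`, and the example vocabulary are all hypothetical. The sketch assumes a query whose triple patterns form a tree rooted at the single answer variable, follows only outgoing edges, and uses a plain Jaccard overlap of concept names and role paths as a stand-in for the concept similarity measures studied in the paper.

```python
from dataclasses import dataclass

RDF_TYPE = "rdf:type"


@dataclass(frozen=True)
class Concept:
    """An EL-style concept: a conjunction of concept names and existential restrictions."""
    names: frozenset   # concept names, e.g. {"Person"}
    exists: frozenset  # pairs (role, Concept), read as "exists role.Concept"


def to_concept(triples, var, _visited=frozenset()):
    """Translate the tree-shaped query rooted at `var` into a Concept.

    `triples` is a list of (subject, predicate, object) strings; variables
    start with '?'. Patterns in which `var` is the object are ignored
    (a simplification of this sketch).
    """
    visited = _visited | {var}
    names, exists = set(), set()
    for s, p, o in triples:
        if s != var:
            continue
        if p == RDF_TYPE and not o.startswith("?"):
            names.add(o)  # class membership becomes a concept name
        elif o.startswith("?"):
            if o in visited:
                raise ValueError("query is cyclic")
            exists.add((p, to_concept(triples, o, visited)))
        else:
            # Assumption: a constant object is treated like a concept name.
            exists.add((p, Concept(frozenset({o}), frozenset())))
    return Concept(frozenset(names), frozenset(exists))


def atoms(c, prefix=()):
    """Collect concept names and role paths as tuples, for a set-based comparison."""
    out = {prefix + (n,) for n in c.names}
    for role, filler in c.exists:
        out.add(prefix + (role,))
        out |= atoms(filler, prefix + (role,))
    return out


def jaccard(c, d):
    """Stand-in similarity: Jaccard overlap of the atoms of two concepts."""
    a, b = atoms(c), atoms(d)
    return len(a & b) / len(a | b) if a | b else 1.0


q1 = [("?x", RDF_TYPE, "Person"), ("?x", "worksAt", "?y"), ("?y", RDF_TYPE, "University")]
q2 = [("?x", RDF_TYPE, "Person"), ("?x", "worksAt", "?z"), ("?z", RDF_TYPE, "Company")]
c1, c2 = to_concept(q1, "?x"), to_concept(q2, "?x")
print(jaccard(c1, c2))
```

Here both queries ask for persons who work somewhere, so they share the concept name and the role, and differ only in the type of the employer. A similarity score of this kind could then feed a clustering step that picks one prototype query per group, which is the role the abstract assigns to conceptual associations.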

Index Terms

Congenial Benchmarking of RDF Storage Solutions
1. Information systems
  1. World Wide Web

Published In

K-CAP '19: Proceedings of the 10th International Conference on Knowledge Capture

September 2019

281 pages

ISBN: 9781450370080

DOI: 10.1145/3360901

General Chairs:
Mayank Kejriwal, University of Southern California Information Sciences Institute, USA
Pedro Szekely, University of Southern California Information Sciences Institute, USA

Program Chair:
Raphaël Troncy, EURECOM, France

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGAI: ACM Special Interest Group on Artificial Intelligence

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Bundesministerium für Verkehr und Digitale Infrastruktur
Bundesministerium für Wirtschaft und Energie
Horizon 2020

Conference

K-CAP '19

Sponsor:

SIGAI

K-CAP '19: Knowledge Capture Conference

November 19 - 21, 2019

Marina Del Rey, CA, USA

Acceptance Rates

Overall Acceptance Rate 55 of 198 submissions, 28%
