Wikidata:Scaling Wikidata/Benchmarking

The goal of the benchmarking project is to develop a query benchmark that can be used to determine how well QLever and other SPARQL engines perform on Wikidata, and to use it to predict how well they will perform as Wikidata grows. The benchmark will cover all kinds of useful queries against Wikidata but will first concentrate on queries that are difficult for some SPARQL implementations.

Background

One problem with the current Wikidata Query Service is that even some simple queries can time out, making it hard for users to interact with Wikidata. Queries related to the Wikidata ontology are particularly vulnerable to this problem, as Blazegraph performs poorly on queries that include transitive closures in property paths. For example, both

 SELECT ?c ?cLabel WHERE { 
   ?c wdt:P279* wd:Q8054 . 
   OPTIONAL { ?c rdfs:label ?cLabel . FILTER ( lang(?cLabel) = 'en' ) }
 }

and

 SELECT ?c ?cLabel WHERE { 
   ?c wdt:P279* wd:Q2424752 . 
 }

currently time out in the official Wikidata Query Service.

Other SPARQL services run much faster on these queries. For example, the QLever Wikidata service runs the first query in 2 seconds and the second query in under 1 second. This opens the possibility that other query services would be adequate to run queries against Wikidata for quite some time.

What matters most here are queries that are hard for SPARQL engines to evaluate on Wikidata, particularly when the difficulty is expected to increase as Wikidata grows. Queries that involve computing the transitive closure of the subclass of (P279) property, like the queries above, are known to be hard, and their difficulty will increase as Wikidata grows, so the initial benchmarks will concentrate on queries that involve the Wikidata ontology.
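To illustrate why such queries are costly, evaluating `?c wdt:P279* wd:X` amounts to a graph traversal whose result set grows with the subclass tree under X. A minimal Python sketch of that traversal, using invented toy edges rather than real Wikidata data:

```python
from collections import deque

def subclasses_of(target, p279_edges):
    """Return all classes reachable from `target` by following
    subclass-of (P279) edges backwards, i.e. the result of
    `?c wdt:P279* wd:<target>`, including the target itself."""
    # Invert the edges: for each class, collect its direct subclasses.
    children = {}
    for sub, sup in p279_edges:
        children.setdefault(sup, set()).add(sub)
    # Breadth-first traversal down the subclass tree.
    seen = {target}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

# Toy ontology fragment (hypothetical identifiers, not real Wikidata):
edges = [("Q_trout", "Q_fish"), ("Q_salmon", "Q_fish"),
         ("Q_rainbow_trout", "Q_trout")]
print(sorted(subclasses_of("Q_fish", edges)))
# → ['Q_fish', 'Q_rainbow_trout', 'Q_salmon', 'Q_trout']
```

The work and the result size are proportional to the number of classes in the closure, which is why these queries get harder as the ontology grows.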

Method

Phase 1

We are gathering a collection of useful SPARQL queries against Wikidata and will run them on a local installation of QLever on a high-end consumer-grade computer, recording the resources used to execute each query when it runs successfully, or which resource was exhausted when it does not. We will concentrate on queries that are hard for some SPARQL implementations but will also include cheaper queries.
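The per-query measurement can be sketched as follows. This is an illustrative outline only: the endpoint URL and timeout are placeholders, and the real harness would also need to capture memory and disk usage, not just wall-clock time.

```python
import time
import urllib.parse
import urllib.request

def build_request(endpoint, query):
    """Build an HTTP GET request for a SPARQL endpoint, asking for JSON results."""
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        endpoint + "?" + params,
        headers={"Accept": "application/sparql-results+json"})

def run_query(endpoint, query, timeout=300):
    """Run one benchmark query and return a small result record:
    whether it succeeded, wall-clock seconds, and response size or error."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(build_request(endpoint, query),
                                    timeout=timeout) as resp:
            body = resp.read()
        return {"ok": True, "seconds": time.monotonic() - start,
                "bytes": len(body)}
    except Exception as exc:  # timeout, connection failure, or server error
        return {"ok": False, "seconds": time.monotonic() - start,
                "error": type(exc).__name__}

# Example against a hypothetical local QLever endpoint:
req = build_request("http://localhost:7001/sparql",
                    "SELECT * WHERE { ?s ?p ?o } LIMIT 1")
```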

The initial source of queries is several sets of queries that were used by one of us to find problems related to the Wikidata ontology. Many of these queries are hard and time out in the Wikidata Query Service. We will also solicit input from Wikidata users.

Some groups of queries will be parameterized to investigate changes in performance for different aspects of the involved data. For example, the ontology queries can be parameterized based on the size and depth of the subclass tree under a concept and on the number of instances of the concept.
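A sketch of how one such parameterized group might be generated; the template and root QIDs here are illustrative, not the benchmark's actual query set:

```python
# Template for the subclass-closure queries discussed above; {qid} is the
# root concept whose subclass tree size and depth we want to vary.
SUBCLASS_TEMPLATE = "SELECT ?c WHERE {{ ?c wdt:P279* wd:{qid} . }}"

def make_subclass_queries(qids):
    """Instantiate the subclass-closure template once per root concept."""
    return {qid: SUBCLASS_TEMPLATE.format(qid=qid) for qid in qids}

# Roots would be chosen to span small and large subclass trees:
queries = make_subclass_queries(["Q8054", "Q2424752"])
```

Grouping instantiated queries by the measured properties of their roots (tree size, tree depth, instance count) then lets performance be plotted against each parameter.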

Phase 2

We will also run the queries we gather on other open-source SPARQL engines, including Blazegraph, Virtuoso, and MillenniumDB, provided they can load Wikidata on our machine.

Phase 3

We will expand the query collection to include queries run in other benchmarks of SPARQL queries on Wikidata, as well as queries derived from the Scholia project.

Phase 4

We will build extended versions of the Wikidata RDF dump, load them into the local installation of QLever, and run benchmarks on them, providing information on how QLever would perform as Wikidata grows.
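One simple way to build such an extended dump is to replicate entities under fresh identifiers, multiplying the number of triples while preserving the shape of the graph. A sketch operating on N-Triples lines; the renaming scheme (offsetting QIDs by a large constant) is an assumption of this sketch, not necessarily the project's actual method:

```python
import re

# Matches Wikidata entity IRIs, capturing the prefix and the numeric QID.
WD_ENTITY = re.compile(r"(http://www\.wikidata\.org/entity/Q)(\d+)")

def scale_triples(lines, copies, offset=10**9):
    """Yield the original N-Triples lines plus `copies` duplicates in which
    every entity IRI Qn is renamed to Q(n + k*offset), giving a dump roughly
    (copies + 1) times the original size."""
    for line in lines:
        yield line
    for k in range(1, copies + 1):
        shift = k * offset
        for line in lines:
            yield WD_ENTITY.sub(
                lambda m, s=shift: m.group(1) + str(int(m.group(2)) + s),
                line)

# One toy triple (property IRIs are left untouched by the entity regex):
dump = ['<http://www.wikidata.org/entity/Q42> '
        '<http://www.wikidata.org/prop/direct/P279> '
        '<http://www.wikidata.org/entity/Q5> .']
doubled = list(scale_triples(dump, copies=1))
# doubled[1] references Q1000000042 and Q1000000005
```

A limitation worth noting: simple replication multiplies the number of disjoint subclass trees but not their depth, so dumps that deepen the ontology would need a different construction.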

Progress

Progress of the project is shown below. Not all future activities are listed.

  • Obtain benchmarking hardware - completed except for extra SSDs
  • Build benchmarking machine - completed except for extra SSDs
  • Install Fedora 41 and other system software - completed

SPARQL Engines

  • Build Blazegraph - completed in Docker
  • Prepare Wikidata for loading into Blazegraph - completed
  • Load Wikidata into Blazegraph
  • Run Blazegraph server on Wikidata

Build Benchmarks

The benchmark harness, the benchmarks, and the results of the benchmark runs are kept in https://github.com/wikius/benchmark-wikidata

  • Build synthetic benchmarks to test performance of local server - completed
  • Build benchmarks from queries used in class order - completed
  • Gather queries used in disjointness
  • Request queries from Wikidata community - in progress, good connection with Scholia
  • Build benchmarks from Wikidata community interactions

Run Benchmarks

  • Revise existing benchmark harness to run queries on multiple services - completed
  • Run first benchmark - subclasses - completed

Analyze Benchmarks

Reporting
