Executing SPARQL queries over the web of linked data

O Hartig, C Bizer, JC Freytag - The Semantic Web-ISWC 2009: 8th …, 2009 - Springer
Abstract
The Web of Linked Data forms a single, globally distributed dataspace. Because this dataspace is open, it is not possible to know in advance all data sources that might be relevant for answering a query. This openness poses a new challenge that traditional research on federated query processing does not address. In this paper we present an approach to executing SPARQL queries over the Web of Linked Data. The main idea of our approach is to discover data that might be relevant for answering a query during the query execution itself. This discovery is driven by following RDF links between data sources, based on URIs in the query and in partial results. The URIs are resolved over HTTP into RDF data, which is continuously added to the queried dataset. This paper describes concepts and algorithms for implementing our approach using an iterator-based pipeline. We introduce a formalization of the pipelining approach and show that classical iterators may cause blocking due to the latency of HTTP requests. To avoid blocking, we propose an extension of the iterator paradigm. Our evaluation shows the strengths of the approach as well as the challenges that remain.
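The traversal idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the in-memory `FAKE_WEB` dictionary stands in for HTTP lookups, and the names `Dataset`, `dereference`, `match`, and `evaluate` are hypothetical, as are the example URIs and the shortened predicate names. The sketch evaluates triple patterns one after another, resolves every URI that appears in a pattern once the bindings from partial results are substituted in, and adds the retrieved triples to the queried dataset before matching. It runs synchronously, so it does not model the non-blocking iterator extension the paper proposes.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Hypothetical stand-in for the Web: maps a URI to the RDF triples that
# dereferencing it would return. A real system would issue HTTP requests.
FAKE_WEB: Dict[str, List[Triple]] = {
    "http://ex.org/alice": [
        ("http://ex.org/alice", "knows", "http://ex.org/bob"),
    ],
    "http://ex.org/bob": [
        ("http://ex.org/bob", "name", '"Bob"'),
    ],
}


def is_var(term: str) -> bool:
    return term.startswith("?")


def is_uri(term: str) -> bool:
    return term.startswith("http://")


class Dataset:
    """Queried dataset that grows as URIs are resolved during execution."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()
        self.seen: Set[str] = set()

    def dereference(self, uri: str) -> None:
        # Resolve each URI at most once and add its triples to the dataset.
        if uri in self.seen:
            return
        self.seen.add(uri)
        self.triples.update(FAKE_WEB.get(uri, []))


def match(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Return the binding extended by this match, or None on a mismatch."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def evaluate(patterns: List[Triple], dataset: Dataset) -> Iterator[Binding]:
    """Evaluate triple patterns in order, following links as they appear."""

    def step(i: int, binding: Binding) -> Iterator[Binding]:
        if i == len(patterns):
            yield binding
            return
        # Substitute bindings from partial results into the next pattern.
        bound = tuple(binding.get(term, term) for term in patterns[i])
        # Link traversal: resolve every URI now present in the pattern.
        for term in bound:
            if is_uri(term):
                dataset.dereference(term)
        # Iterate over a snapshot, since deeper steps may grow the dataset.
        for triple in list(dataset.triples):
            extended = match(bound, triple, binding)
            if extended is not None:
                yield from step(i + 1, extended)

    yield from step(0, {})
```

Evaluating the patterns `("http://ex.org/alice", "knows", "?friend")` and `("?friend", "name", "?name")` against an empty `Dataset` first resolves the URI for Alice, binds `?friend` to the URI for Bob, and only then resolves Bob's URI to find his name. The second data source is discovered during execution rather than being known in advance.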