ANAPSID: An Adaptive Query Processing Engine for SPARQL Endpoints

Acosta, Maribel; Vidal, Maria-Esther; Lampo, Tomas; Castillo, Julio; Ruckhaus, Edna

doi:10.1007/978-3-642-25073-6_2

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7031)

Included in the following conference series:

International Semantic Web Conference

Abstract

Following the design rules of Linked Data, the number of available SPARQL endpoints that support remote query processing is growing quickly; however, because of the lack of adaptivity, query executions may frequently be unsuccessful. First, fixed plans identified following the traditional optimize-then-execute paradigm may time out as a consequence of endpoint availability. Second, because blocking operators are usually implemented, endpoint query engines are not able to produce results incrementally, and may become blocked if data sources stop sending data. We present ANAPSID, an adaptive query engine for SPARQL endpoints that adapts query execution schedulers to data availability and run-time conditions. ANAPSID provides physical SPARQL operators that detect when a source becomes blocked or data traffic is bursty, and that opportunistically produce results as quickly as data arrives from the sources. Additionally, ANAPSID operators implement main memory replacement policies to move previously computed matches to secondary memory while avoiding duplicates. We compared ANAPSID performance with respect to RDF stores and endpoints, and observed that ANAPSID speeds up execution time, in some cases by more than one order of magnitude.
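
The idea of a non-blocking operator that emits results as data arrives can be illustrated with a minimal sketch of a symmetric hash join. This is not ANAPSID's actual operator: it keeps everything in main memory, so it omits the replacement policies that move matches to secondary memory, and it does not detect blocked or bursty sources. The function name symmetric_hash_join, the "left"/"right" source labels, and the dictionary representation of solution mappings are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# A solution mapping: variable name -> bound value (illustrative representation).
Binding = Dict[str, str]


def symmetric_hash_join(
    arrivals: Iterable[Tuple[str, Binding]], join_var: str
) -> Iterator[Binding]:
    """Join two sources on join_var, yielding matches as tuples arrive.

    arrivals is a single interleaved stream of (side, binding) pairs,
    where side is "left" or "right" (hypothetical labels). Each arriving
    binding is stored in its own side's hash table and immediately probed
    against the other side's table, so no input has to be fully read
    before the first result is produced.
    """
    tables = {"left": defaultdict(list), "right": defaultdict(list)}
    for side, binding in arrivals:
        other = "right" if side == "left" else "left"
        key = binding[join_var]
        # Insert first, then probe the opposite table.
        tables[side][key].append(binding)
        for match in tables[other].get(key, []):
            # Each pair is emitted exactly once: when its later member arrives.
            yield {**match, **binding}


if __name__ == "__main__":
    stream = [
        ("left", {"x": "a", "y": "1"}),
        ("right", {"x": "b", "z": "2"}),
        ("right", {"x": "a", "z": "3"}),  # matches the first left tuple
        ("left", {"x": "b", "y": "4"}),   # matches the first right tuple
    ]
    for result in symmetric_hash_join(stream, "x"):
        print(result)
```

Because a pair is emitted only when its second member arrives, this in-memory sketch produces no duplicates. An operator that also flushes tuples to secondary memory, as the abstract describes, needs additional bookkeeping to preserve that property.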

Author information

Authors and Affiliations

Universidad Simón Bolívar, Caracas, Venezuela
Maribel Acosta, Maria-Esther Vidal, Julio Castillo & Edna Ruckhaus
University of Maryland, College Park, USA
Tomas Lampo

Editor information

Editors and Affiliations

Computer Science Dept., VU University Amsterdam, De Boelelaan 1081, 1081 HV, Amsterdam, The Netherlands
Lora Aroyo
IBM Research, 10598, Yorktown Heights, NY, USA
Chris Welty
The Open University, Walton Hall, MK7 6AA, Milton Keynes, UK
Harith Alani
Google, USA
Jamie Taylor
University of Zurich, Binzmuehlestrasse 14, 8050, Zurich, Switzerland
Abraham Bernstein
Massachusetts Institute of Technology, 32 Vassar Street, 02139, Cambridge, MA, USA
Lalana Kagal
Stanford University, 94305, Stanford, CA, USA
Natasha Noy
Linköping University, 581 83, Linköping, Sweden
Eva Blomqvist

Cite this paper

Acosta, M., Vidal, ME., Lampo, T., Castillo, J., Ruckhaus, E. (2011). ANAPSID: An Adaptive Query Processing Engine for SPARQL Endpoints. In: Aroyo, L., et al. The Semantic Web – ISWC 2011. ISWC 2011. Lecture Notes in Computer Science, vol 7031. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25073-6_2

DOI: https://doi.org/10.1007/978-3-642-25073-6_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25072-9
Online ISBN: 978-3-642-25073-6
eBook Packages: Computer Science, Computer Science (R0)
