Abstract
SPARQL property path queries provide a succinct way to write complex navigational queries over RDF knowledge graphs. However, their evaluation remains difficult because it may require computing transitive closures. As a result, many property path queries simply time out when executed on public online RDF knowledge graphs. One way to speed up their execution is to find optimal join orders. Although the join ordering problem has been extensively studied for traditional SPARQL queries, the presence of property path patterns biases existing approaches. In this paper, we focus on \(C2RPQ_{UF}\) queries (conjunctive SPARQL property path queries with UNION and FILTER), and we present a query optimizer that captures the cost of \(C2RPQ_{UF}\) queries using an appropriate cost model and a sampling-based cardinality estimator. On the latest Wikidata Query Benchmark, we empirically demonstrate that our approach finds significantly better join orders than Virtuoso and BlazeGraph.
Notes
- 1. True cardinalities were computed using SPARQL COUNT queries on the Wikidata SPARQL endpoint as of December 5, 2022.
Acknowledgments
This work is supported by the ANR project DeKaloG (Decentralized Knowledge Graphs), ANR-19-CE23-0014, CE23 - Intelligence artificielle, and the CominLabs project MikroLog (The Microdata Knowledge Graph).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Aimonier-Davat, J., Skaf-Molli, H., Molli, P., Dang, MH., Nédelec, B. (2023). Join Ordering of SPARQL Property Path Queries. In: Pesquita, C., et al. The Semantic Web. ESWC 2023. Lecture Notes in Computer Science, vol 13870. Springer, Cham. https://doi.org/10.1007/978-3-031-33455-9_3
DOI: https://doi.org/10.1007/978-3-031-33455-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-33454-2
Online ISBN: 978-3-031-33455-9
eBook Packages: Computer Science, Computer Science (R0)