Join Ordering of SPARQL Property Path Queries

  • Conference paper
The Semantic Web (ESWC 2023)

Abstract

SPARQL property path queries provide a succinct way to write complex navigational queries over RDF knowledge graphs. However, their evaluation remains difficult because they may require computing transitive closures. As a result, many property path queries simply time out when executed on public online RDF knowledge graphs. One way to speed up their execution is to find optimal join orders. Although the join ordering problem has been extensively studied for traditional SPARQL queries, the presence of property path patterns biases existing approaches. In this paper, we focus on \(C2RPQ_{UF}\) queries (conjunctive SPARQL property path queries with UNION and FILTER) and present a query optimizer that captures the cost of \(C2RPQ_{UF}\) queries using an appropriate cost model and a sampling-based cardinality estimator. On the latest Wikidata Query Benchmark, we empirically demonstrate that our approach finds significantly better join orders than Virtuoso and BlazeGraph.
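To make the idea of sampling-based cardinality estimation for a transitive pattern concrete, here is a minimal Python sketch. It is not the estimator described in the paper: it assumes an in-memory list of triples, a single pattern of the form `?s p+ ?o`, and uniform sampling of start nodes. All function names are hypothetical.

```python
import random
from collections import defaultdict, deque


def build_adjacency(triples, predicate):
    """Index the edges labelled `predicate` as subject -> set of objects."""
    adj = defaultdict(set)
    for s, p, o in triples:
        if p == predicate:
            adj[s].add(o)
    return adj


def reachable(adj, start):
    """Nodes reachable from `start` through one or more edges (p+ semantics)."""
    seen = set()
    queue = deque(adj.get(start, ()))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(adj.get(node, ()))
    return seen


def exact_closure_cardinality(adj):
    """Exact number of (s, o) solutions of `?s p+ ?o` (costly on large graphs)."""
    return sum(len(reachable(adj, s)) for s in adj)


def estimate_closure_cardinality(adj, sample_size, rng):
    """Estimate the same count from a uniform sample of start nodes.

    The mean number of nodes reached per sampled subject is scaled by the
    number of distinct subjects (illustrative assumption, not the paper's method).
    """
    subjects = sorted(adj)
    if not subjects or sample_size <= 0:
        return 0.0
    sample = [rng.choice(subjects) for _ in range(sample_size)]
    mean = sum(len(reachable(adj, s)) for s in sample) / sample_size
    return mean * len(subjects)


if __name__ == "__main__":
    chain = [("a", "p", "b"), ("b", "p", "c"), ("c", "p", "d")]
    adj = build_adjacency(chain, "p")
    print(exact_closure_cardinality(adj))
    print(estimate_closure_cardinality(adj, 100, random.Random(0)))
```

An optimizer could compare such estimates across the patterns of a query and evaluate the most selective one first. The quality of the estimate depends on the sample size and on how skewed the reachability sets are.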


Notes

  1. True cardinalities were computed using SPARQL COUNT queries on the Wikidata SPARQL endpoint as of December 5, 2022.

  2. https://docs.openlinksw.com/virtuoso/rdfsparqlimplementatiotrans.

  3. https://docs.openlinksw.com/virtuoso/rdfperfcost.

  4. https://github.com/blazegraph/database/wiki/QueryHints.

  5. https://github.com/JulienDavat/Join-Ordering-of-SPARQL-Property-Path-Queries.

Acknowledgments

This work is supported by the ANR project DeKaloG (Decentralized Knowledge Graphs), ANR-19-CE23-0014, CE23 - Intelligence artificielle, and the CominLabs project MikroLog (The Microdata Knowledge Graph).

Author information

Corresponding author

Correspondence to Julien Aimonier-Davat.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Aimonier-Davat, J., Skaf-Molli, H., Molli, P., Dang, MH., Nédelec, B. (2023). Join Ordering of SPARQL Property Path Queries. In: Pesquita, C., et al. The Semantic Web. ESWC 2023. Lecture Notes in Computer Science, vol 13870. Springer, Cham. https://doi.org/10.1007/978-3-031-33455-9_3

  • DOI: https://doi.org/10.1007/978-3-031-33455-9_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-33454-2

  • Online ISBN: 978-3-031-33455-9

  • eBook Packages: Computer Science, Computer Science (R0)
