Research article
DOI: 10.1145/2213556.2213573

The complexity of evaluating path expressions in SPARQL

Published: 21 May 2012

Abstract

The World Wide Web Consortium (W3C) recently introduced property paths in SPARQL 1.1, a query language for RDF data. Property paths allow SPARQL queries to evaluate regular expressions over graph data. However, they differ from standard regular expressions in several notable aspects. For example, they have a limited form of negation, they have numerical occurrence indicators as syntactic sugar, and their semantics on graphs is defined in a non-standard manner. We formalize the W3C semantics of property paths and investigate various query evaluation problems on graphs. More specifically, let x and y be two nodes in an edge-labeled graph and r be an expression. We study the complexities of (1) deciding whether there exists a path from x to y that matches r and (2) counting how many paths from x to y match r. Our main results show that, compared to an alternative semantics of regular expressions on graphs, the complexity of (1) and (2) under W3C semantics is significantly higher. Whereas the alternative semantics remains in polynomial time for large fragments of expressions, the W3C semantics makes problems (1) and (2) intractable almost immediately.
As a side-result, we prove that the membership problem for regular expressions with numerical occurrence indicators and negation is in polynomial time.
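
To make problem (1) concrete, here is a minimal Python sketch of deciding whether some path from x to y matches r. It assumes the alternative semantics is the usual one in which a path may revisit nodes and edges; it does not implement the W3C semantics that the abstract reports as intractable. The tuple encoding of expressions and the names `matching_pairs` and `path_exists` are illustrative assumptions, not notation from the paper.

```python
def matching_pairs(edges, nodes, expr):
    """Return all (u, v) such that some path from u to v matches expr.

    Assumption: paths may repeat nodes and edges (not the W3C semantics).
    expr is a nested tuple (hypothetical encoding):
      ("sym", a) | ("alt", r1, r2) | ("cat", r1, r2) | ("star", r)
    """
    kind = expr[0]
    if kind == "sym":
        # A single edge carrying the requested label.
        return {(u, v) for (u, label, v) in edges if label == expr[1]}
    if kind == "alt":
        return (matching_pairs(edges, nodes, expr[1])
                | matching_pairs(edges, nodes, expr[2]))
    if kind == "cat":
        # Relational composition: a left match followed by a right match.
        left = matching_pairs(edges, nodes, expr[1])
        right = matching_pairs(edges, nodes, expr[2])
        return {(u, w) for (u, v) in left for (v2, w) in right if v == v2}
    if kind == "star":
        # Reflexive-transitive closure of the inner relation.
        step = matching_pairs(edges, nodes, expr[1])
        closure = {(n, n) for n in nodes}
        frontier = set(closure)
        while frontier:
            new = {(u, w) for (u, v) in frontier
                   for (v2, w) in step if v == v2} - closure
            closure |= new
            frontier = new
        return closure
    raise ValueError(f"unknown expression kind: {kind!r}")


def path_exists(edges, nodes, expr, x, y):
    """Problem (1) under the assumed semantics: does some x-to-y path match expr?"""
    return (x, y) in matching_pairs(edges, nodes, expr)


# Example graph: x -a-> m, m -b-> m (self-loop), m -c-> y
edges = [("x", "a", "m"), ("m", "b", "m"), ("m", "c", "y")]
nodes = {"x", "m", "y"}
# The expression a b* c
expr = ("cat", ("sym", "a"), ("cat", ("star", ("sym", "b")), ("sym", "c")))
print(path_exists(edges, nodes, expr, "x", "y"))
```

Each operator yields a relation of at most |V|² node pairs, so the whole evaluation stays polynomial in the sizes of the graph and the expression. That is consistent with the abstract's remark that the alternative semantics remains in polynomial time for large fragments of expressions.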

    Published In

    PODS '12: Proceedings of the 31st ACM SIGMOD-SIGACT-SIGAI symposium on Principles of Database Systems
    May 2012
    332 pages
    ISBN:9781450312486
    DOI:10.1145/2213556

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. graph data
    2. query evaluation
    3. regular expression

    Qualifiers

    • Research-article

    Conference

    SIGMOD/PODS '12

    Acceptance Rates

    PODS '12 Paper Acceptance Rate 26 of 101 submissions, 26%;
    Overall Acceptance Rate 642 of 2,707 submissions, 24%
