
Dichotomies for Evaluating Simple Regular Path Queries

Published: 15 October 2019

Abstract

Regular path queries (RPQs) are a central component of graph databases. We investigate decision and enumeration problems concerning the evaluation of RPQs under several semantics that have recently been considered: arbitrary paths, shortest paths, paths without node repetitions (simple paths), and paths without edge repetitions (trails).
Whereas arbitrary and shortest paths can be dealt with efficiently, simple paths and trails become computationally difficult already for very small RPQs. We study RPQ evaluation for simple paths and trails from a parameterized complexity perspective and define a class of simple transitive expressions that is prominent in practice and for which we can prove dichotomies for the evaluation problem. We observe that, even though simple path and trail semantics are intractable for RPQs in general, they are feasible for the vast majority of RPQs that are used in practice. At the heart of this study is a result of independent interest: the two disjoint paths problem in directed graphs is W[1]-hard if parameterized by the length of one of the two paths.
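
The gap between the semantics named in the abstract can be illustrated with a small sketch. The code below is not the authors' algorithm; it is a minimal illustration under stated assumptions: the graph is a list of labeled edges `(u, label, v)`, the RPQ is given as a deterministic automaton encoded as a dictionary, and all function and variable names are hypothetical. Arbitrary-path semantics is decided by a breadth-first search over (node, state) pairs, which takes polynomial time. Simple-path and trail semantics are decided here by exhaustive search, which can take exponential time in the worst case.

```python
from collections import deque

def exists_arbitrary_path(edges, dfa, start, accepting, s, t):
    """Arbitrary-path semantics: BFS over (node, automaton state) pairs."""
    seen = {(s, start)}
    queue = deque(seen)
    while queue:
        node, state = queue.popleft()
        if node == t and state in accepting:
            return True
        for u, label, v in edges:
            nxt = dfa.get((state, label))
            if u == node and nxt is not None and (v, nxt) not in seen:
                seen.add((v, nxt))
                queue.append((v, nxt))
    return False

def exists_restricted_path(edges, dfa, start, accepting, s, t, mode):
    """Exhaustive search: mode 'simple' forbids repeated nodes,
    mode 'trail' forbids repeated edges."""
    if mode not in ("simple", "trail"):
        raise ValueError("mode must be 'simple' or 'trail'")

    def dfs(node, state, used_nodes, used_edges):
        if node == t and state in accepting:
            return True
        for i, (u, label, v) in enumerate(edges):
            nxt = dfa.get((state, label))
            if u != node or nxt is None:
                continue
            if mode == "simple" and v in used_nodes:
                continue
            if mode == "trail" and i in used_edges:
                continue
            if dfs(v, nxt, used_nodes | {v}, used_edges | {i}):
                return True
        return False

    return dfs(s, start, frozenset({s}), frozenset())

# Automaton for (aa)*: paths of even length over the label 'a'.
even_dfa = {(0, "a"): 1, (1, "a"): 0}

# The only simple path from s to t has length 3; the detour through
# the cycle y -> z -> w -> y yields a path of length 6 that repeats node y.
edges = [
    ("s", "a", "x"), ("x", "a", "y"), ("y", "a", "t"),
    ("y", "a", "z"), ("z", "a", "w"), ("w", "a", "y"),
]

arbitrary = exists_arbitrary_path(edges, even_dfa, 0, {0}, "s", "t")
trail = exists_restricted_path(edges, even_dfa, 0, {0}, "s", "t", "trail")
simple = exists_restricted_path(edges, even_dfa, 0, {0}, "s", "t", "simple")
print(arbitrary, trail, simple)
```

On this toy graph the query (aa)* has a matching arbitrary path and a matching trail from s to t, but no matching simple path, because every even-length path must pass through node y twice.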

Supplementary Material

a16-martens-apndx.pdf (martens.zip)
Supplemental movie, appendix, image, and software files for Dichotomies for Evaluating Simple Regular Path Queries

Published In

ACM Transactions on Database Systems, Volume 44, Issue 4
Best of EDBT 2017, Best of EDBT 2018, Best of ICDT 2018 and Regular Papers
December 2019
249 pages
ISSN: 0362-5915
EISSN: 1557-4644
DOI: 10.1145/3366712
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 October 2019
Accepted: 01 May 2019
Revised: 01 March 2019
Received: 01 August 2018
Published in TODS Volume 44, Issue 4

Author Tags

  1. Graph databases
  2. parameterized complexity
  3. regular languages
  4. regular path queries

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2024) Path Querying in Graph Databases: A Systematic Mapping Study. IEEE Access 12, 33154-33172. DOI: 10.1109/ACCESS.2024.3371976. Online publication date: 2024
  • (2023) Representing Paths in Graph Database Pattern Matching. Proceedings of the VLDB Endowment 16:7, 1790-1803. DOI: 10.14778/3587136.3587151. Online publication date: 8-May-2023
  • (2023) Conjunctive Regular Path Queries under Injective Semantics. Proceedings of the 42nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, 231-240. DOI: 10.1145/3584372.3588664. Online publication date: 18-Jun-2023
  • (2022) Towards Theory for Real-World Data. Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, 261-276. DOI: 10.1145/3517804.3526066. Online publication date: 12-Jun-2022
  • (2022) The Complexity of Regular Trail and Simple Path Queries on Undirected Graphs. Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, 165-174. DOI: 10.1145/3517804.3524149. Online publication date: 12-Jun-2022
  • (2022) Conjunctive Regular Path Queries with Capture Groups. ACM Transactions on Database Systems 47:2, 1-52. DOI: 10.1145/3514230. Online publication date: 23-May-2022
  • (2021) PG-Keys: Keys for Property Graphs. Proceedings of the 2021 International Conference on Management of Data, 2423-2436. DOI: 10.1145/3448016.3457561. Online publication date: 9-Jun-2021
  • (2021) Querying in the Age of Graph Databases and Knowledge Graphs. Proceedings of the 2021 International Conference on Management of Data, 2821-2828. DOI: 10.1145/3448016.3457545. Online publication date: 9-Jun-2021
  • (2020) Efficient Logspace Classes for Enumeration, Counting, and Uniform Generation. ACM SIGMOD Record 49:1, 52-59. DOI: 10.1145/3422648.3422661. Online publication date: 4-Sep-2020
  • (2020) Formal Languages in Information Extraction and Graph Databases. Beyond the Horizon of Computability, 306-309. DOI: 10.1007/978-3-030-51466-2_28. Online publication date: 29-Jun-2020
