research-article

Distinct Shortest Walk Enumeration for RPQs

Published: 14 May 2024

    Abstract

    We consider the Distinct Shortest Walks problem. Given two vertices s and t of a graph database D and a regular path query, we want to enumerate all walks of minimal length from s to t that carry a label conforming to the query. Usual theoretical solutions turn out to be inefficient when applied to graph models that are closer to real-life systems, in particular because edges may carry multiple labels. Indeed, known algorithms may repeat the same answer exponentially many times. We propose an efficient algorithm for graph databases with multiple labels. The preprocessing runs in O(|D|×|A|) and the delay between two consecutive outputs is in O(λ×|A|), where A is a nondeterministic automaton representing the query and λ is the minimal length. The algorithm can handle epsilon-transitions in A or queries given as regular expressions at no additional cost.
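
To make the problem statement concrete, here is a minimal Python sketch of one straightforward way to enumerate distinct shortest walks when edges carry several labels. It is not the algorithm of the paper and makes no attempt to match its stated preprocessing or delay bounds. It assumes an epsilon-free automaton; the function name `distinct_shortest_walks`, the edge encoding `(source, labels, target)`, and the transition dictionary `delta` are all hypothetical choices made for illustration. The idea is to run a backward breadth-first search in the product of the graph and the automaton, then explore walks edge by edge while tracking the set of automaton states compatible with the walk so far, so that a walk is never reported once per accepting run.

```python
from collections import defaultdict, deque


def distinct_shortest_walks(edges, delta, initial, final, s, t):
    """Yield each shortest conforming walk from s to t once, as a tuple of edge indices.

    edges:   list of (source, set_of_labels, target); an edge may carry several labels
    delta:   dict mapping (state, label) -> set of successor states (no epsilon moves)
    initial: set of initial automaton states; final: set of final automaton states
    """
    out_edges = defaultdict(list)  # vertex -> indices of outgoing edges
    in_edges = defaultdict(list)   # vertex -> indices of incoming edges
    for i, (u, _labels, v) in enumerate(edges):
        out_edges[u].append(i)
        in_edges[v].append(i)

    # Reversed automaton transitions: (successor state, label) -> predecessor states.
    rev_delta = defaultdict(set)
    for (q, a), successors in delta.items():
        for q2 in successors:
            rev_delta[(q2, a)].add(q)

    # Backward BFS in the product graph: dist[(v, q)] is the minimal number of
    # edges needed to go from (v, q) to some (t, final state).
    dist = {(t, f): 0 for f in final}
    queue = deque(dist)
    while queue:
        v, q2 = queue.popleft()
        for i in in_edges[v]:
            u, labels, _ = edges[i]
            for a in labels:
                for q in rev_delta.get((q2, a), ()):
                    if (u, q) not in dist:
                        dist[(u, q)] = dist[(v, q2)] + 1
                        queue.append((u, q))

    starts = [q for q in initial if (s, q) in dist]
    if not starts:
        return  # no conforming walk from s to t
    lam = min(dist[(s, q)] for q in starts)  # minimal length of a conforming walk

    def extend(v, states, remaining, walk):
        if remaining == 0:
            yield tuple(walk)
            return
        for i in out_edges[v]:
            _, labels, w = edges[i]
            # Keep only the states that can still reach the target in time.
            nxt = {q2 for q in states for a in labels
                   for q2 in delta.get((q, a), ())
                   if dist.get((w, q2)) == remaining - 1}
            if nxt:
                walk.append(i)
                yield from extend(w, nxt, remaining - 1, walk)
                walk.pop()

    yield from extend(s, {q for q in starts if dist[(s, q)] == lam}, lam, [])


# Hypothetical example: query (a|b)c, and edge 0 carries both labels a and b.
edges = [("s", {"a", "b"}, "m"), ("m", {"c"}, "t"),
         ("s", {"a"}, "n"), ("n", {"c"}, "t")]
delta = {(0, "a"): {1}, (0, "b"): {2}, (1, "c"): {3}, (2, "c"): {3}}
walks = list(distinct_shortest_walks(edges, delta, {0}, {3}, "s", "t"))
print(walks)
```

In this example the walk made of edges 0 and 1 has two accepting runs, one reading `a` and one reading `b` on edge 0. An enumeration of paths in the product graph would report it twice, whereas tracking state sets reports it once.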


    Published In

    Proceedings of the ACM on Management of Data, Volume 2, Issue 2
    PODS
    May 2024
    852 pages
    EISSN: 2836-6573
    DOI: 10.1145/3665155
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 14 May 2024
    Published in PACMMOD Volume 2, Issue 2

    Author Tags

    1. all shortest walks
    2. deduplication
    3. enumeration complexity
    4. graph databases
    5. regular path queries

    Qualifiers

    • Research-article
