DOI: 10.1145/3517804.3524159
research-article
Open access

Data Path Queries over Embedded Graph Databases

Published: 13 June 2022

Abstract

This paper initiates the study of data-path query languages (in particular, regular data path queries (RDPQ) and conjunctive RDPQ (CRDPQ)) in the classic setting of embedded finite model theory, wherein each graph is "embedded" into a background infinite structure (with a decidable FO theory or fragments thereof). Our goal is to address the current lack of support for typed attribute data (e.g. integer arithmetic) in existing data-path query languages, support that is crucial in practice. We propose an extension of register automata that allows powerful constraints over the theory and the database as guards, and that has two types of registers: registers that can store values from the active domain, and read-only registers that can store arbitrary values. We prove NL data complexity for (C)RDPQ over Presburger arithmetic, the real-closed field, the existential theory of automatic structures, and word equations with regular constraints. All these results strictly extend the known NL data complexity of RDPQ with only equality comparisons, and provide an answer to a recent open problem posed by Libkin et al. Among others, we introduce one crucial proof technique for obtaining NL data complexity for data path queries over embedded graph databases, called "Restricted Register Collapse (RRC)", inspired by the notion of Restricted Quantifier Collapse (RQC) in embedded finite model theory.
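
To make the register-automaton idea concrete, the following is a toy sketch in Python. It is not the paper's construction: the function name `evaluate_rdpq`, the graph encoding (one integer data value per node), the choice to initialize every register with the source node's value, and the use of a Python callable as a stand-in for an arithmetic guard are all assumptions made for illustration. It models only registers over the active domain, which keeps the set of configurations finite, and it omits the read-only registers described above.

```python
from collections import deque


def evaluate_rdpq(nodes, edges, automaton, source):
    """Return the nodes reachable from `source` along an accepted data path.

    nodes:     dict mapping node -> integer data value (hypothetical encoding)
    edges:     list of (u, label, v) triples
    automaton: dict with keys 'initial', 'accepting', 'num_registers' and
               'transitions'; each transition is (q, label, guard, store, q2),
               where guard(regs, value) -> bool and `store` is the index of
               the register overwritten with the new value (or None).
    """
    adjacency = {}
    for u, label, v in edges:
        adjacency.setdefault(u, []).append((label, v))

    # Assumption: every register starts with the source node's data value.
    init_regs = (nodes[source],) * automaton["num_registers"]
    start = (source, automaton["initial"], init_regs)

    # Breadth-first search over configurations (node, state, registers).
    # Registers only ever hold values occurring in the graph, so the
    # configuration space is finite and the search terminates.
    seen = {start}
    queue = deque([start])
    answers = set()
    while queue:
        node, state, regs = queue.popleft()
        if state in automaton["accepting"]:
            answers.add(node)
        for label, target in adjacency.get(node, []):
            value = nodes[target]
            for q, a, guard, store, q2 in automaton["transitions"]:
                if q != state or a != label or not guard(regs, value):
                    continue
                new_regs = regs
                if store is not None:
                    new_regs = regs[:store] + (value,) + regs[store + 1:]
                nxt = (target, q2, new_regs)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return answers


# Toy data graph: each node carries one integer attribute.
nodes = {"a": 1, "b": 3, "c": 5, "d": 2}
edges = [("a", "next", "b"), ("b", "next", "c"),
         ("c", "next", "d"), ("a", "next", "d")]

# One-register automaton: follow 'next' edges while the data value grows
# by at least 2 at each step (a linear-arithmetic guard, not just equality).
automaton = {
    "initial": "q0",
    "accepting": {"q0"},
    "num_registers": 1,
    "transitions": [
        ("q0", "next", lambda regs, value: value >= regs[0] + 2, 0, "q0"),
    ],
}

answers = evaluate_rdpq(nodes, edges, automaton, "a")
print(sorted(answers))
```

The guard here compares the current value against a stored one using addition and ordering, which an equality-only RDPQ cannot express. The explicit search over configurations is only meant to show why bounding the registers to active-domain values keeps evaluation finite; it says nothing about how the paper obtains its NL bounds.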



Published In

PODS '22: Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
June 2022
462 pages
ISBN: 9781450392600
DOI: 10.1145/3517804
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. complexity
  2. data graphs
  3. embedded finite models
  4. regular path queries

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '22
Acceptance Rates

Overall Acceptance Rate 642 of 2,707 submissions, 24%
