Minimization of Tree Patterns

Published: 25 July 2018

Abstract

Many of today’s graph query languages are based on graph pattern matching. We investigate optimization of tree-shaped patterns that have transitive closure operators. Such patterns not only appear in the context of graph databases but also were originally studied for querying tree-structured data, where they can perform child, descendant, node label, and wildcard tests.
The minimization problem aims at reducing the number of nodes in patterns and goes back to the early 2000s. We provide an example showing that, in contrast to earlier claims, tree patterns cannot be minimized by deleting nodes only. The example resolves the M =? NR problem, which asks whether a tree pattern is minimal if and only if it is nonredundant. The example can be adapted to prove that minimization is $\Sigma^P_2$-complete, which resolves another question that had been open since the early research on the problem. The latter result shows that, unless NP = $\Pi^P_2$, more general approaches for minimizing tree patterns are also bound to fail in general.
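
To make the objects in the abstract concrete, the following is a minimal sketch of tree patterns with child edges, descendant edges, node labels, and the wildcard, together with a naive matcher. It is an illustration only, not the paper's construction: the class names `Tree` and `Pattern`, the edge kinds `"child"` and `"desc"`, and the choice to let a pattern match at any node of the tree are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Tree:
    """A node-labeled tree (hypothetical representation for this sketch)."""
    label: str
    children: List["Tree"] = field(default_factory=list)


@dataclass
class Pattern:
    """A tree pattern node; label "*" is the wildcard.

    Each edge is (kind, subpattern), where kind is "child" or "desc".
    """
    label: str
    edges: List[Tuple[str, "Pattern"]] = field(default_factory=list)


def proper_descendants(t: Tree) -> Iterator[Tree]:
    """Yield every node strictly below t."""
    for c in t.children:
        yield c
        yield from proper_descendants(c)


def matches_at(p: Pattern, t: Tree) -> bool:
    """True if pattern p can be embedded with its root mapped to node t."""
    if p.label != "*" and p.label != t.label:
        return False
    for kind, sub in p.edges:
        candidates = t.children if kind == "child" else proper_descendants(t)
        if not any(matches_at(sub, c) for c in candidates):
            return False
    return True


def matches(p: Pattern, t: Tree) -> bool:
    """True if p matches at some node of t (an assumption of this sketch)."""
    return matches_at(p, t) or any(matches_at(p, d) for d in proper_descendants(t))


def size(p: Pattern) -> int:
    """Number of nodes in the pattern, the quantity minimization reduces."""
    return 1 + sum(size(sub) for _, sub in p.edges)


# p1 is a[//b][/c/b]; p2 is a[/c/b], obtained from p1 by deleting one node.
p1 = Pattern("a", [("desc", Pattern("b")),
                   ("child", Pattern("c", [("child", Pattern("b"))]))])
p2 = Pattern("a", [("child", Pattern("c", [("child", Pattern("b"))]))])

t1 = Tree("a", [Tree("c", [Tree("b")])])
t2 = Tree("a", [Tree("b")])
```

In this sketch the descendant branch of `p1` is redundant: any tree that matches the `/c/b` branch also contains a `b` descendant, so `p1` and `p2` match the same trees while `size(p2)` is smaller. This is the easy case in which deleting a node suffices; the abstract's point is that such deletions do not always reach a minimal pattern.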

Published In

Journal of the ACM  Volume 65, Issue 4
Distributed Computing, Cryptography, Distributed Computing, Cryptography, Coding Theory, Automata Theory, Complexity Theory, Programming Languages, Algorithms, Invited Paper Foreword and Databases
August 2018
307 pages
ISSN:0004-5411
EISSN:1557-735X
DOI:10.1145/3208081
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 July 2018
Accepted: 01 January 2018
Revised: 01 October 2017
Received: 01 January 2017
Published in JACM Volume 65, Issue 4

Author Tags

  1. XML
  2. XPath
  3. complexity
  4. graph databases
  5. graphs
  6. optimization
  7. pattern matching
  8. tree patterns
  9. trees

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Poland's National Science Centre
  • Deutsche Forschungsgemeinschaft

Article Metrics

  • Downloads (Last 12 months): 25
  • Downloads (Last 6 weeks): 3
Reflects downloads up to 30 Aug 2024

Cited By

  • (2022) Towards Theory for Real-World Data. Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, 261-276. DOI: 10.1145/3517804.3526066. Online publication date: 12-Jun-2022
  • (2022) Conjunctive Regular Path Queries with Capture Groups. ACM Transactions on Database Systems 47:2, 1-52. DOI: 10.1145/3514230. Online publication date: 23-May-2022
  • (2022) Discovering the Roots: Uniform Closure Results for Algebraic Classes Under Factoring. Journal of the ACM 69:3, 1-39. DOI: 10.1145/3510359. Online publication date: 11-Jun-2022
  • (2022) Information Acquisition Under Resource Limitations in a Noisy Environment. Journal of the ACM 69:3, 1-37. DOI: 10.1145/3510024. Online publication date: 27-Jun-2022
  • (2022) Learning Disjunctive Multiplicity Expressions and Disjunctive Generalize Multiplicity Expressions From Both Positive and Negative Examples. The Computer Journal 66:7, 1733-1748. DOI: 10.1093/comjnl/bxac037. Online publication date: 18-Apr-2022
  • (2020) Conjunctive Regular Path Queries with String Variables. Proceedings of the 39th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, 361-374. DOI: 10.1145/3375395.3387663. Online publication date: 14-Jun-2020
  • (2020) JSON. Information Systems 89:C. DOI: 10.1016/j.is.2019.101478. Online publication date: 1-Mar-2020
  • (2019) An analytical study of large SPARQL query logs. The VLDB Journal 29:2-3, 655-679. DOI: 10.1007/s00778-019-00558-9. Online publication date: 2-Aug-2019
