
Semantic Optimization of Conjunctive Queries

Published: 29 October 2020

Abstract

This work deals with the problem of semantic optimization of the central class of conjunctive queries (CQs). Since CQ evaluation is NP-complete, a long line of research has focussed on identifying fragments of CQs that can be efficiently evaluated. One of the most general restrictions corresponds to generalized hypertreewidth bounded by a fixed constant k ≥ 1; the associated fragment is denoted GHWk. A CQ is semantically in GHWk if it is equivalent to a CQ in GHWk. The problem of checking whether a CQ is semantically in GHWk has been studied in the constraint-free case, and it has been shown to be NP-complete. However, in case the database is subject to constraints such as tuple-generating dependencies (TGDs) that can express, e.g., inclusion dependencies, or equality-generating dependencies (EGDs) that capture, e.g., key dependencies, a CQ may turn out to be semantically in GHWk under the constraints, while not being semantically in GHWk without the constraints. This opens avenues to new query optimization techniques. In this article, we initiate and develop the theory of semantic optimization of CQs under constraints. More precisely, we study the following natural problem: Given a CQ and a set of constraints, is the query semantically in GHWk, for a fixed k ≥ 1, under the constraints, or, in other words, is the query equivalent to one that belongs to GHWk over all those databases that satisfy the constraints? We show that, contrary to what one might expect, decidability of CQ containment is a necessary but not a sufficient condition for the decidability of the problem in question. In particular, we show that checking whether a CQ is semantically in GHW1 is undecidable in the presence of full TGDs (i.e., Datalog rules) or EGDs.
In view of the above negative results, we focus on the main classes of TGDs for which CQ containment is decidable and that do not capture the class of full TGDs, i.e., guarded, non-recursive, and sticky sets of TGDs, and show that the problem in question is decidable, while its complexity coincides with the complexity of CQ containment. We also consider key dependencies over unary and binary relations, and we show that the problem in question is decidable in elementary time. Furthermore, we investigate whether being semantically in GHWk alleviates the cost of query evaluation. Finally, in case a CQ is not semantically in GHWk, we discuss how it can be approximated via a CQ that falls in GHWk in an optimal way. Such approximations might help in finding “quick” answers to the input query when exact evaluation is intractable.
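
To make the notion of being semantically in GHW1 under constraints concrete, the following is a minimal sketch, not taken from the article. It relies on the standard fact that GHW1 coincides with the acyclic CQs, and tests acyclicity with the classical GYO reduction. The relation names Interest, Class, and Owns, the TGD in the comments, and the function name is_acyclic are hypothetical and chosen for illustration only. All atom arguments are assumed to be variables.

```python
from collections import Counter


def is_acyclic(atoms):
    """GYO reduction: return True iff the query hypergraph is acyclic.

    `atoms` is a list of (relation_name, variables) pairs; only the variable
    sets matter here. Assumption: every argument is a variable (no constants).
    """
    edges = [set(variables) for _, variables in atoms]
    changed = True
    while changed:
        changed = False
        # Rule 1: remove every variable that occurs in exactly one hyperedge.
        counts = Counter(v for e in edges for v in e)
        for e in edges:
            lonely = {v for v in e if counts[v] == 1}
            if lonely:
                e -= lonely
                changed = True
        # Rule 2: remove one hyperedge that is empty or contained in another.
        for i, e in enumerate(edges):
            if not e or any(i != j and e <= f for j, f in enumerate(edges)):
                del edges[i]
                changed = True
                break
    # The hypergraph is acyclic iff the reduction eliminates every hyperedge.
    return not edges


# Hypothetical query q: Interest(x, z), Class(y, z), Owns(x, y).
# Its hypergraph is a triangle, so q is not in GHW1.
q = [("Interest", ("x", "z")), ("Class", ("y", "z")), ("Owns", ("x", "y"))]

# Assumed constraint (a full TGD): Interest(x, z), Class(y, z) -> Owns(x, y).
# On every database satisfying it, the Owns atom is implied by the other two,
# so q is equivalent to q_prime below. The code does not verify this
# equivalence; it only checks the shape of the two queries.
q_prime = [("Interest", ("x", "z")), ("Class", ("y", "z"))]

print(is_acyclic(q))        # cyclic query
print(is_acyclic(q_prime))  # acyclic reformulation
```

In this sketch q is not equivalent to any acyclic CQ in the absence of constraints, yet under the assumed TGD it can be replaced by q_prime, which is the kind of reformulation the problem studied in the article asks about.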



Published In

Journal of the ACM, Volume 67, Issue 6
December 2020
260 pages
ISSN: 0004-5411
EISSN: 1557-735X
DOI: 10.1145/3428359

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 October 2020
Accepted: 01 September 2020
Revised: 01 September 2020
Received: 01 November 2018
Published in JACM Volume 67, Issue 6


Author Tags

  1. Conjunctive queries
  2. approximations
  3. equality-generating dependencies
  4. evaluation
  5. hypertreewidth
  6. optimization
  7. tuple-generating dependencies

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Royal Society in the context of the project “RAISON DATA”
  • EPSRC Programme
  • Millennium Institute for Foundational Research on Data (IMFD Chile)
  • Fondecyt
  • EPSRC
  • ANR project QUID
  • ANR project DéLTA
