DOI:10.1145/3294052.3319686

Topology Dependent Bounds For FAQs

Published: 25 June 2019

Abstract

In this paper, we prove topology dependent bounds on the number of rounds needed to compute Functional Aggregate Queries (FAQs) studied by Abo Khamis et al. [PODS 2016] in a synchronous distributed network under the model considered by Chattopadhyay et al. [FOCS 2014, SODA 2017]. Unlike the recent work on computing database queries in the Massively Parallel Computation model, in the model of Chattopadhyay et al., nodes can communicate only via private point-to-point channels and we are interested in bounds that work over an arbitrary communication topology. This model, which is closer to the well-studied CONGEST model in distributed computing and generalizes Yao's two-party communication complexity model, has so far only been studied for problems that are common in the two-party communication complexity literature. This is the first work to consider more practically motivated problems in this distributed model. For the sake of exposition, we focus on two specific problems in this paper: Boolean Conjunctive Query (BCQ) and computing variable/factor marginals in Probabilistic Graphical Models (PGMs). We obtain tight bounds on the number of rounds needed to compute such queries as long as the underlying hypergraph of the query is $O(1)$-degenerate and has $O(1)$-arity. In particular, the $O(1)$-degeneracy condition covers most well-studied queries that are efficiently computable in the centralized computation model like queries with constant treewidth. These tight bounds depend on a new notion of 'width' (namely internal-node-width) for Generalized Hypertree Decompositions (GHDs) of acyclic hypergraphs, which minimizes the number of internal nodes in a sub-class of GHDs. To the best of our knowledge, this width has not been studied explicitly in the theoretical database literature.
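To make the Boolean Conjunctive Query problem concrete, here is a minimal centralized sketch. It is not the distributed protocol analyzed in the paper: it simply asks whether some assignment of the variables satisfies every factor at once, by brute force. The function name `bcq`, the representation of a factor as a scope plus a set of allowed tuples, and the example relations are all assumptions made for illustration.

```python
from itertools import product


def bcq(domains, factors):
    """Return True if some assignment satisfies every factor.

    domains: dict mapping variable name -> list of possible values
    factors: list of (scope, relation) pairs, where scope is a tuple of
             variable names and relation is a set of allowed value tuples
    """
    variables = sorted(domains)
    # Brute force: try every assignment of values to variables.
    for values in product(*(domains[v] for v in variables)):
        assignment = dict(zip(variables, values))
        if all(
            tuple(assignment[v] for v in scope) in relation
            for scope, relation in factors
        ):
            return True
    return False


# Hypothetical path query R(a, b), S(b, c) over a three-value domain.
domains = {"a": [0, 1, 2], "b": [0, 1, 2], "c": [0, 1, 2]}
R = (("a", "b"), {(0, 1)})
S_match = (("b", "c"), {(1, 2)})
S_mismatch = (("b", "c"), {(2, 0)})

print(bcq(domains, [R, S_match]))     # the two relations agree on b = 1
print(bcq(domains, [R, S_mismatch]))  # no shared value of b exists
```

In the distributed setting of the abstract, the factors would be held by different nodes of the network, and the quantity of interest is the number of communication rounds rather than the running time of a loop like this one.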
Finally, we consider the problem of computing the product of a vector with a chain of matrices and prove tight bounds on its round complexity (over a finite field of two elements) using a novel min-entropy based argument.
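As an illustration of the last problem, the sketch below computes the product of a chain of matrices with a vector over the field of two elements in a purely centralized way. It assumes a column vector multiplied on the right, so the chain is applied right to left; the abstract does not fix this convention, and the helper names `mat_vec_gf2` and `chain_times_vector` are hypothetical.

```python
def mat_vec_gf2(matrix, vec):
    """Multiply a 0/1 matrix by a 0/1 column vector, with arithmetic mod 2."""
    return [sum(a & b for a, b in zip(row, vec)) % 2 for row in matrix]


def chain_times_vector(matrices, vec):
    """Compute A_1 A_2 ... A_k x over GF(2).

    Applying the matrices right to left keeps every intermediate result
    a vector, so no matrix-matrix product is ever formed.
    """
    for matrix in reversed(matrices):
        vec = mat_vec_gf2(matrix, vec)
    return vec


# Hypothetical 2x2 example.
A1 = [[1, 1], [0, 1]]
A2 = [[0, 1], [1, 0]]
x = [1, 0]
print(chain_times_vector([A1, A2], x))  # [1, 1]
```

Here `A2` first maps `x` to `[0, 1]`, and `A1` then maps that to `[1, 1]`.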

References

[1]
Abu-Elkheir, M., Hayajneh, M., and Ali, N. A. Data management for the internet of things: Design primitives and solution. Sensors 13, 11 (2013), 15582--15612.
[2]
Afrati, F. N., Joglekar, M. R., Ré, C., Salihoglu, S., and Ullman, J. D. GYM: A multiround distributed join algorithm. In ICDT (2017), pp. 4:1--4:18.
[3]
Aji, S. M., and McEliece, R. J. The generalized distributive law. IEEE Transactions on Information Theory 46, 2 (Mar 2000), 325--343.
[4]
Akatov, D. Exploiting parallelism in decomposition methods for constraint satisfaction. PhD thesis, University of Oxford, UK, 2010.
[5]
Alon, N., Hoory, S., and Linial, N. The Moore bound for irregular graphs. Graphs and Combinatorics 18, 1 (Mar 2002), 53--57.
[6]
Alon, N., and Spencer, J. The Probabilistic Method. John Wiley, 1992.
[7]
Bakibayev, N., Kociský, T., Olteanu, D., and Zavodny, J. Aggregation and ordering in factorised databases. PVLDB 6, 14 (2013), 1990--2001.
[8]
Bar-Yossef, Z., Jayram, T. S., Kumar, R., and Sivakumar, D. An information statistics approach to data stream and communication complexity. J. Comput. Syst. Sci. 68, 4 (2004), 702--732.
[9]
Beame, P., Koutris, P., and Suciu, D. Communication steps for parallel query processing. In PODS (2013), pp. 273--284.
[10]
Beame, P., Koutris, P., and Suciu, D. Skew in parallel query processing. In PODS (2014), pp. 212--223.
[11]
Bellenbaum, P., and Diestel, R. Two short proofs concerning tree-decompositions. Combinatorics, Probability & Computing 11, 6 (2002), 541--547.
[12]
Bodlaender, H. L. NC-algorithms for graphs with small treewidth. In Graph-Theoretic Concepts in Computer Science, 14th International Workshop, WG '88, Amsterdam, The Netherlands, June 15--17, 1988, Proceedings (1988), pp. 1--10.
[13]
Bonnet, P., Gehrke, J., and Seshadri, P. Towards sensor database systems. In Mobile Data Management, Second International Conference, MDM 2001, Hong Kong, China, January 8--10, 2001, Proceedings (2001), pp. 3--14.
[14]
Braverman, M., Ellen, F., Oshman, R., Pitassi, T., and Vaikuntanathan, V. A tight bound for set disjointness in the message-passing model. In FOCS (2013), pp. 668--677.
[15]
Chakrabarti, A., Shi, Y., Wirth, A., and Yao, A. C. Informational complexity and the direct sum problem for simultaneous message complexity. In FOCS (2001), pp. 270--278.
[16]
Chattopadhyay, A. Personal communication, 2018.
[17]
Chattopadhyay, A., Koucký, M., Loff, B., and Mukhopadhyay, S. Simulation beats richness: New data-structure lower bounds. In STOC (2018).
[18]
Chattopadhyay, A., Langberg, M., Li, S., and Rudra, A. Tight network topology dependent bounds on rounds of communication. In SODA (2017), pp. 2524--2539.
[19]
Chattopadhyay, A., Radhakrishnan, J., and Rudra, A. Topology matters in communication. In FOCS (2014), pp. 631--640.
[20]
Chattopadhyay, A., and Rudra, A. The range of topological effects on communication. In ICALP (2015), pp. 540--551.
[21]
Dodis, Y., and Oliveira, R. On extracting private randomness over a public channel. In RANDOM (2003), pp. 252--263.
[22]
Erde, J. A unified treatment of linked and lean tree-decompositions. J. Comb. Theory, Ser. B 130 (2018), 114--143.
[23]
Gehrke, J., and Madden, S. Query processing in sensor networks. IEEE Pervasive Computing 3, 1 (2004), 46--55.
[24]
Ghobadi, M., Mahajan, R., Phanishayee, A., Devanur, N. R., Kulkarni, J., Ranade, G., Blanche, P., Rastegarfar, H., Glick, M., and Kilper, D. C. ProjecToR: Agile reconfigurable data center interconnect. In SIGCOMM (2016), pp. 216--229.
[25]
Göös, M., Lovett, S., Meka, R., Watson, T., and Zuckerman, D. Rectangles are nonnegative juntas. In STOC (2015), pp. 257--266.
[26]
Göös, M., Pitassi, T., and Watson, T. Query-to-communication lifting for BPP. In FOCS (2017), pp. 132--143.
[27]
Gottlob, G., Greco, G., Leone, N., and Scarcello, F. Hypertree decompositions: Questions and answers. In PODS (2016), pp. 57--74.
[28]
Graham, M. H. On the universal relation. In Tech Report (1979).
[29]
Greco, G., Leone, N., and Scarcello, F. On weighted hypertree decompositions. In Proceedings of the Twelfth Italian Symposium on Advanced Database Systems, SEBD 2004, S. Margherita di Pula, Cagliari, Italy, June 21--23, 2004 (2004), pp. 54--61.
[30]
Jayram, T. S., Kumar, R., and Sivakumar, D. Two applications of information complexity. In STOC (2003), pp. 673--682.
[31]
Joglekar, M., and Ré, C. It's all a matter of degree: Using degree information to optimize multiway joins. In ICDT (2016), pp. 11:1--11:17.
[32]
Khamis, M. A., Ngo, H. Q., and Rudra, A. FAQ: questions asked frequently. In PODS (2016), pp. 13--28.
[33]
Khamis, M. A., Ngo, H. Q., and Rudra, A. Juggling functions inside a database. SIGMOD Record 46, 1 (2017), 6--13.
[34]
Kossmann, D. The state of the art in distributed query processing. ACM Comput. Surv. 32, 4 (2000), 422--469.
[35]
Kostochka, A. V. On almost (k-1)-degenerate (k+1)-chromatic graphs and hypergraphs. Discrete Mathematics 313, 4 (2013), 366--374.
[36]
Koutris, P., Beame, P., and Suciu, D. Worst-case optimal algorithms for parallel query processing. In ICDT (2016), pp. 8:1--8:18.
[37]
Koutris, P., and Suciu, D. A guide to formal analysis of join processing in massively parallel systems. SIGMOD Record 45, 4 (2016), 18--27.
[38]
Langberg, M., Li, S., Jayaraman, S. V. M., and Rudra, A. Topology Dependent Bounds For FAQs. UB-CSE Tech Report 2019-01, 2019. https://cse.buffalo.edu/tech-reports/2019-01.pdf.
[39]
Lau, L. C. An approximate max-Steiner-tree-packing min-Steiner-cut theorem. Combinatorica 27, 1 (2007), 71--90.
[40]
Madden, S., Franklin, M. J., Hellerstein, J. M., and Hong, W. TinyDB: an acquisitional query processing system for sensor networks. ACM Trans. Database Syst. 30, 1 (2005), 122--173.
[41]
Olteanu, D., and Závodný, J. Size bounds for factorised representations of query results. ACM Trans. Database Syst. 40, 1 (2015), 2:1--2:44.
[42]
Peleg, D. Distributed Computing: A Locality-Sensitive Approach.
[43]
Phillips, J. M., Verbin, E., and Zhang, Q. Lower bounds for number-in-hand multiparty communication complexity, made easy. In SODA (2012), pp. 486--501.
[44]
Renner, R., and Wolf, S. Simple and tight bounds for information reconciliation and privacy amplification. In ASIACRYPT (Berlin, Heidelberg, 2005), Springer-Verlag, pp. 199--216.
[45]
Scarcello, F., Greco, G., and Leone, N. Weighted hypertree decompositions and optimal query plans. In PODS (2004), pp. 210--221.
[46]
Tarjan, R. E., and Yannakakis, M. Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs. SIAM J. Comput. 13, 3 (1984), 566--579.
[47]
Thomas, R. A Menger-like property of tree-width: The finite case. J. Comb. Theory, Ser. B 48, 1 (1990), 67--76.
[48]
Tiwari, P. Lower bounds on communication complexity in distributed computer networks. J. ACM 34, 4 (1987), 921--938.
[49]
Vadhan, S. P. Pseudorandomness. Foundations and Trends in Theoretical Computer Science 7, 1--3 (2012), 1--336.
[50]
Woodruff, D., and Zhang, Q. Tight bounds for distributed functional monitoring. In STOC (2012), pp. 941--960.
[51]
Yu, C. T., and Ozsoyoglu, M. Z. An algorithm for tree-query membership of a distributed query. In The IEEE Computer Society's Third International Computer Software and Applications Conference, COMPSAC 1979, 6--8 November, 1979, Chicago, Illinois, USA (1979), pp. 306--312.
[52]
Zuckerman, D. Simulating BPP using a general weak random source. Algorithmica 16, 4/5 (1996), 367--391.

Published In

PODS '19: Proceedings of the 38th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
June 2019
494 pages
ISBN:9781450362276
DOI:10.1145/3294052
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. boolean conjunctive query
  2. communication complexity lower bounds
  3. congest model
  4. probabilistic graphical models
  5. topology dependent bounds

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '19: International Conference on Management of Data
June 30 - July 5, 2019
Amsterdam, Netherlands

Acceptance Rates

PODS '19 Paper Acceptance Rate 29 of 87 submissions, 33%;
Overall Acceptance Rate 642 of 2,707 submissions, 24%
