Computational complexity of queries based on itemsets

Published: 15 June 2006

Abstract

We investigate the problem of determining exact bounds on the frequencies of conjunctions based on frequent itemsets. This scenario is an important special case of some general probabilistic logic problems that are known to be intractable. We show that, despite these restrictions, our problems remain intractable: checking whether the maximal consistent frequency of a query is larger than a given threshold is NP-complete, and evaluating the Maximum Entropy estimate of a query is PP-hard. We also prove that checking consistency is NP-complete.
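To make the notion of frequency bounds concrete, here is a minimal sketch of the simplest possible case: a query that is the conjunction of two items whose individual frequencies are the only known information. In that case the tight bounds are the classical Fréchet inequalities. This is an illustration only and not the construction used in the article; the function names `frechet_bounds` and `max_frequency_exceeds` are hypothetical, and the hardness results above concern the general setting with many overlapping itemsets, where no such closed form is available.

```python
from fractions import Fraction


def frechet_bounds(freq_a, freq_b):
    """Tight bounds on freq(a AND b) when only freq(a) and freq(b) are known.

    Assumption: frequencies are fractions in [0, 1] of the same dataset.
    """
    lower = max(Fraction(0), freq_a + freq_b - 1)
    upper = min(freq_a, freq_b)
    return lower, upper


def max_frequency_exceeds(freq_a, freq_b, threshold):
    """Decide whether the maximal consistent frequency of (a AND b) is
    larger than the threshold, for this two-item special case only."""
    _, upper = frechet_bounds(freq_a, freq_b)
    return upper > threshold


if __name__ == "__main__":
    # Hypothetical input: item a occurs in 70% of transactions, item b in 60%.
    print(frechet_bounds(Fraction(7, 10), Fraction(6, 10)))
    print(max_frequency_exceeds(Fraction(7, 10), Fraction(6, 10), Fraction(1, 2)))
```

Exact rational arithmetic via `Fraction` is used so that threshold comparisons are not affected by floating-point rounding.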

Published In

Information Processing Letters, Volume 98, Issue 5
15 June 2006
59 pages

Publisher

Elsevier North-Holland, Inc.

United States

Author Tags

  1. Computational complexity
  2. Data mining
  3. Itemset

Qualifiers

  • Article

Cited By

  • (2022) Discovering Significant Patterns under Sequential False Discovery Control. Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 263-272. DOI: 10.1145/3534678.3539398. Online publication date: 14-Aug-2022
  • (2018) Generating Realistic Synthetic Population Datasets. ACM Transactions on Knowledge Discovery from Data 12:4, pp. 1-22. DOI: 10.1145/3182383. Online publication date: 16-Apr-2018
  • (2018) Interactive Discovery of Coordinated Relationship Chains with Maximum Entropy Models. ACM Transactions on Knowledge Discovery from Data 12:1, pp. 1-34. DOI: 10.1145/3047017. Online publication date: 31-Jan-2018
  • (2018) Uncovering the plot. Data Mining and Knowledge Discovery 28:5-6, pp. 1398-1428. DOI: 10.1007/s10618-014-0370-1. Online publication date: 26-Dec-2018
  • (2018) Comparing apples and oranges. Data Mining and Knowledge Discovery 25:2, pp. 173-207. DOI: 10.1007/s10618-012-0275-9. Online publication date: 26-Dec-2018
  • (2018) From sets of good redescriptions to good sets of redescriptions. Knowledge and Information Systems 57:1, pp. 21-54. DOI: 10.1007/s10115-017-1149-7. Online publication date: 1-Oct-2018
  • (2012) Summarizing data succinctly with the most informative itemsets. ACM Transactions on Knowledge Discovery from Data (TKDD) 6:4, pp. 1-42. DOI: 10.1145/2382577.2382580. Online publication date: 18-Dec-2012
  • (2011) Comparing apples and oranges. Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part III, pp. 398-413. DOI: 10.5555/2034161.2034188. Online publication date: 5-Sep-2011
  • (2011) Tell me what i need to know. Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 573-581. DOI: 10.1145/2020408.2020499. Online publication date: 21-Aug-2011
  • (2011) Comparing apples and oranges: measuring differences between data mining results. Proceedings of the 2011th European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part III, pp. 398-413. DOI: 10.1007/978-3-642-23808-6_26. Online publication date: 5-Sep-2011

View Options

View options

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media