On condensed representations of constrained frequent patterns

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Constrained frequent patterns and closed frequent patterns are two paradigms aimed at reducing the set of extracted patterns to a smaller, more interesting subset. Although much work has been done within both paradigms, there is still confusion around the mining problem obtained by combining closed and constrained frequent patterns in a single framework. In this paper, we shed light on this problem by providing a formal definition and a thorough characterisation. We also study computational issues and show how to combine the most recent results in both paradigms, providing a very efficient algorithm that exploits the two requirements (satisfying constraints and being closed) together at mining time in order to reduce the computation as much as possible.
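
To make the source of the confusion concrete, the toy sketch below compares two plausible ways of combining the requirements: filtering the closed frequent itemsets by the constraint, versus keeping the itemsets that are closed among the frequent itemsets satisfying the constraint. It is a brute-force illustration using the standard textbook definitions, not the algorithm or the formal definition given in the paper. The transactions, the prices, the budget constraint and all function names are assumptions made up for this example.

```python
from itertools import combinations

def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def frequent_itemsets(transactions, min_sup):
    """Brute-force enumeration of all itemsets with support >= min_sup."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            s = frozenset(combo)
            sup = support(s, transactions)
            if sup >= min_sup:
                result[s] = sup
    return result

def closed_within(family):
    """Keep itemsets that have no proper superset in `family` with equal support."""
    return {
        s: sup
        for s, sup in family.items()
        if not any(s < t and sup == family[t] for t in family)
    }

# Hypothetical toy data: three transactions and a price per item.
transactions = [frozenset("abc"), frozenset("abc"), frozenset("a")]
price = {"a": 1, "b": 2, "c": 5}

def within_budget(itemset):
    """Example constraint (an assumption): total price of the itemset is at most 3."""
    return sum(price[i] for i in itemset) <= 3

frequent = frequent_itemsets(transactions, min_sup=2)

# Reading 1: compute the closed frequent itemsets, then filter by the constraint.
filter_after = {
    s: sup for s, sup in closed_within(frequent).items() if within_budget(s)
}

# Reading 2: restrict to frequent itemsets satisfying the constraint,
# then keep those that are closed within that restricted family.
constrained = {s: sup for s, sup in frequent.items() if within_budget(s)}
closed_among_constrained = closed_within(constrained)

print(sorted("".join(sorted(s)) for s in filter_after))
print(sorted("".join(sorted(s)) for s in closed_among_constrained))
```

On this data the first reading keeps only {a}, because the closed itemset {a, b, c} exceeds the budget, so the supports of the constrained frequent itemsets {b} and {a, b} can no longer be recovered. The second reading keeps {a} and {a, b}, from which the support of every constrained frequent itemset can still be derived. The two readings therefore describe different mining problems.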

Author information

Corresponding author

Correspondence to Francesco Bonchi.

Additional information

Francesco Bonchi received his Ph.D. in computer science from the University of Pisa in December 2003, with the thesis “Frequent Pattern Queries: Language and Optimizations”. Currently, he is a postdoc at the Institute of Information Science and Technologies (ISTI) of the Italian National Research Council in Pisa, where he is a member of the Knowledge Discovery and Delivery Laboratory. He has been a visiting fellow at the Kanwal Rekhi School of Information Technology, Indian Institute of Technology, Bombay (2000, 2001). His current research interests are data mining query languages and optimization, frequent pattern mining, privacy-preserving data mining, and bioinformatics. He is one of the teachers of a course on data mining held at the Faculty of Economics of the University of Pisa. He has served as a referee for various national and international conferences on databases, data mining, logic programming and artificial intelligence.

Claudio Lucchese received his Master's degree in Computer Science summa cum laude from Ca' Foscari University of Venice in October 2003. He is currently a Ph.D. student at the same university and a research associate at the Institute of Information Science and Technologies (ISTI) of the Italian National Research Council in Pisa, where he is a member of the High Performance Computing Laboratory. He is mainly interested in frequent pattern mining, privacy-preserving data mining, and data mining techniques for information retrieval.

About this article

Cite this article

Bonchi, F., Lucchese, C. On condensed representations of constrained frequent patterns. Knowl Inf Syst 9, 180–201 (2006). https://doi.org/10.1007/s10115-005-0201-1
