Abstract
Mining association rules is a popular and well-researched method for discovering interesting relations between variables in large databases. A practical problem is that at medium to low support values a large number of frequent itemsets, and an even larger number of association rules, are often found in a database. A widely used approach is to gradually increase minimum support and minimum confidence, or to filter the found rules using increasingly strict constraints on additional measures of interestingness, until the set of rules is reduced to a manageable size. In this paper we describe a different approach, which is based on first defining a set of “interesting” itemsets (e.g., by a mixture of mining and expert knowledge) and then, in a second step, selectively generating rules for only these itemsets. The main advantage of this approach over increasing thresholds or filtering rules is that the number of rules found is significantly reduced without having to raise the support and confidence thresholds, a step that might cause important information in the database to be missed.
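The two-step idea can be illustrated with a minimal Python sketch. It is not the implementation described in the paper: the function names `support` and `rules_for_itemsets`, the toy transactions, and the choice to enumerate every antecedent/consequent split of an itemset are all assumptions made for illustration. The sketch takes a given list of "interesting" itemsets and generates rules for those itemsets only, rather than for every frequent itemset in the data.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rules_for_itemsets(itemsets, transactions, min_confidence=0.0):
    """Generate rules lhs -> rhs only for the given itemsets.

    Each rule splits one itemset Z into a non-empty antecedent lhs
    and a non-empty consequent rhs = Z - lhs (hypothetical helper).
    """
    rules = []
    for z in map(frozenset, itemsets):
        supp_z = support(z, transactions)
        if supp_z == 0:
            continue  # itemset never occurs, so no rule can be scored
        for k in range(1, len(z)):
            for lhs in combinations(sorted(z), k):
                lhs = frozenset(lhs)
                # confidence(lhs -> rhs) = supp(Z) / supp(lhs)
                conf = supp_z / support(lhs, transactions)
                if conf >= min_confidence:
                    rules.append((lhs, z - lhs, supp_z, conf))
    return rules


# Toy data, invented for illustration only.
transactions = [
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk"}),
    frozenset({"milk"}),
]
interesting = [{"bread", "butter"}]

rules = rules_for_itemsets(interesting, transactions, min_confidence=0.6)
for lhs, rhs, supp, conf in rules:
    print(sorted(lhs), "->", sorted(rhs),
          f"support={supp:.2f}", f"confidence={conf:.2f}")
```

Because rules are generated only from the supplied itemsets, the size of the output is bounded by the number of ways to split those itemsets, regardless of how many frequent itemsets the database contains at the chosen support level.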
Cite this article
Hahsler, M., Buchta, C. & Hornik, K. Selective association rule generation. Computational Statistics 23, 303–315 (2008). https://doi.org/10.1007/s00180-007-0062-z
DOI: https://doi.org/10.1007/s00180-007-0062-z