Using quantitative information for efficient association rule generation

Authors:

B. Pôssas,

M. Carvalho,

R. Resende,

W. Meita, Jr.Authors Info & Claims

ACM SIGMOD Record, Volume 29, Issue 4

Pages 19 - 25

https://doi.org/10.1145/369275.369278

Published: 01 December 2000 Publication History

PDF eReader

Abstract

The problem of mining association rules in categorical data presented in customer transactions was introduced by Agrawal, Imielinski and Swami [2]. This seminal work gave birth to several investigation efforts [4, 13] resulting in descriptions of how to extend the original concepts and how to increase the performance of the related algorithms.The original problem of mining association rules was formulated as how to find rules of the form set₁ → set₂. This rule is supposed to denote affinity or correlation among the two sets containing nominal or ordinal data items. More specifically, such an association rule should translate the following meaning: customers that buy the products in set₁ also buy the products in set₂. Statistical basis is represented in the form of minimum support and confidence measures of these rules with respect to the set of customer transactions.The original problem as proposed by Agrawal et al. [2] was extended in several directions such as adding or replacing the confidence and support by other measures, or filtering the rules during or after generation, or including quantitative attributes. Srikant e Agrawal [16] describe an new approach where quantitative data can be treated as categorical. This is very important since otherwise part of the customer transaction information is discarded. Whenever an extension is proposed it must be checked in terms of its performance. The algorithm efficiency is linked to the size of the database that is amenable to be treated. Therefore it is crucial to have efficient algorithms that enable us to examine and extract valuable decision-making information in the ever larger databases.In this paper we present an algorithm that can be used in the context of several of the extensions provided in the literature but at the same time preserves its performance, as demonstrated in a case study. The approach in our algorithm is to explore multidimensional properties of the data (provided such properties are present), allowing us to combine this additional information in a very efficient pruning phase. This results in a very flexible and efficient algorithm that was used with success in several experiments using categorical and quantitative databases.The paper is organized as follows. In the next section we describe the quantitative association rules and we present an algorithm to generate it. Section 3 presents an optimization of the pruning phase of the Apriori [4] algorithm based on quantitative information associated with the items. Section 4 presents our experimental results for mining four synthetic workloads, followed by some related work in Section 5. Finally we present some conclusions and future work in Section 6.

References

[1]

{1} R. Agrawal, T. Imielinski, and A. Swami. Database mining: A performance perspective. In IEEE Transactions on Knowlegde and Data Engineering, December 1993.

Abstract

References

Cited By

Index Terms

Recommendations

Association rule mining and quantitative association rule mining among infrequent items

Association rule and quantitative association rule mining among infrequent items

Generalized association rule mining using an efficient data structure

Comments

Information

Published In

Publisher

Publication History

Check for updates

Qualifiers

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

View options

PDF

eReader

Login options

Full Access

Share

Share this Publication link

Share on social media

Affiliations