DOI: 10.1145/1014052.1014086
Article

Support envelopes: a technique for exploring the structure of association patterns

Published: 22 August 2004

Abstract

This paper introduces support envelopes, a new tool for analyzing association patterns, and illustrates some of their properties, applications, and possible extensions. Specifically, the support envelope for a transaction data set and a specified pair of positive integers (m,n) consists of the items and transactions that need to be searched to find any association pattern involving m or more transactions and n or more items. For any transaction data set with M transactions and N items, there is a unique lattice of at most M*N support envelopes that captures the structure of the association patterns in that data set. Because support envelopes are not encumbered by a support threshold, this support lattice provides a complete view of the association structure of the data set, including association patterns that have low support. Furthermore, the boundary of the support lattice (the support boundary) has at most min(M,N) envelopes and is especially interesting since it bounds the maximum sizes of potential association patterns, not only for frequent, closed, and maximal itemsets, but also for more general patterns such as error-tolerant itemsets. The association structure can be represented graphically as a two-dimensional scatter plot of the (m,n) values associated with the support envelopes of the data set, a feature that is useful in the exploratory analysis of association patterns. Finally, the algorithm to compute support envelopes is simple and computationally efficient, and it is straightforward to parallelize the process of finding all the support envelopes.
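
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible reading of the definition. It assumes that a pattern with m or more transactions and n or more items can only involve items that occur in at least m of the remaining transactions, and transactions that contain at least n of the remaining items. Everything else is pruned repeatedly until nothing changes. The function name `support_envelope`, the dictionary-of-sets input format, and the toy data are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, Set, Tuple


def support_envelope(
    transactions: Dict[str, Set[str]], m: int, n: int
) -> Tuple[Set[str], Set[str]]:
    """Return the (transaction ids, items) that survive iterative pruning.

    Assumed pruning rule (a reading of the abstract, not the paper's code):
    drop items found in fewer than m surviving transactions, drop
    transactions holding fewer than n surviving items, repeat until stable.
    """
    rows = {t: set(items) for t, items in transactions.items()}
    while True:
        # Count how many surviving transactions contain each item.
        counts: Dict[str, int] = {}
        for items in rows.values():
            for item in items:
                counts[item] = counts.get(item, 0) + 1
        keep_items = {item for item, c in counts.items() if c >= m}

        # Restrict each transaction to kept items, then drop short ones.
        pruned = {t: items & keep_items for t, items in rows.items()}
        pruned = {t: items for t, items in pruned.items() if len(items) >= n}

        if pruned == rows:  # fixed point: nothing was removed this pass
            break
        rows = pruned

    surviving_items: Set[str] = set().union(*rows.values())
    return set(rows), surviving_items


# Hypothetical toy data set: 5 transactions over items a..e.
toy = {
    "t1": {"a", "b", "c"},
    "t2": {"a", "b", "c"},
    "t3": {"a", "b"},
    "t4": {"c", "d"},
    "t5": {"e"},
}

envelope_2_2 = support_envelope(toy, m=2, n=2)
envelope_3_2 = support_envelope(toy, m=3, n=2)
envelope_3_3 = support_envelope(toy, m=3, n=3)
```

Under these assumptions, calling the function for every pair (m,n) with 1 <= m <= M and 1 <= n <= N and recording the size of each non-empty result would give the points for the two-dimensional scatter plot the abstract mentions. Each call is independent of the others, which is consistent with the remark that the computation is straightforward to parallelize.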

    Published In

    KDD '04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
    August 2004
    874 pages
    ISBN:1581138881
    DOI:10.1145/1014052
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. association analysis
    2. error-tolerant itemsets
    3. formal concept analysis
    4. support envelope

    Qualifiers

    • Article

    Conference

    KDD04

    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Cited By

    • (2015) Mining Approximate Frequent Patterns from Noisy Databases. Proceedings of the 2015 10th International Conference on Broadband and Wireless Computing, Communication and Applications (BWCCA), pages 400-403. DOI: 10.1109/BWCCA.2015.29. Online publication date: 4-Nov-2015
    • (2015) Mining summarization of high utility itemsets. Knowledge-Based Systems, 84:C, pages 67-77. DOI: 10.1016/j.knosys.2015.04.004. Online publication date: 1-Aug-2015
    • (2014) A Unifying Framework for Mining Approximate Top-$k$ Binary Patterns. IEEE Transactions on Knowledge and Data Engineering, 26:12, pages 2900-2913. DOI: 10.1109/TKDE.2013.181. Online publication date: Dec-2014
    • (2013) Looking for a structural characterization of the sparseness measure of (frequent closed) itemset contexts. Information Sciences: an International Journal, 222, pages 343-361. DOI: 10.1016/j.ins.2012.08.005. Online publication date: 1-Feb-2013
    • (2011) Summarizing transactional databases with overlapped hyperrectangles. Data Mining and Knowledge Discovery, 23:2, pages 215-251. DOI: 10.1007/s10618-010-0203-9. Online publication date: 1-Sep-2011
    • (2009) An association analysis approach to biclustering. Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 677-686. DOI: 10.1145/1557019.1557095. Online publication date: 28-Jun-2009
    • (2008) Succinct summarization of transactional databases. Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 758-766. DOI: 10.1145/1401890.1401981. Online publication date: 24-Aug-2008
    • (2008) Quantitative evaluation of approximate frequent pattern mining algorithms. Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 301-309. DOI: 10.1145/1401890.1401930. Online publication date: 24-Aug-2008
    • (2008) Approximate Frequent Itemset Mining In the Presence of Random Noise. Soft Computing for Knowledge Discovery and Data Mining, pages 363-389. DOI: 10.1007/978-0-387-69935-6_15. Online publication date: 2008
    • (2007) Twain. ACM Transactions on Knowledge Discovery from Data, 1:2, article 8-es. DOI: 10.1145/1267066.1267069. Online publication date: 1-Aug-2007
