Article, doi:10.5555/844380.844755

Efficient Progressive Sampling for Association Rules

Published: 09 December 2002

Abstract

In data mining, sampling has often been suggested as an effective tool to reduce the size of the dataset operated on, at some cost to accuracy. However, this loss to accuracy is often difficult to measure and characterize since the exact nature of the learning curve (accuracy vs. sample size) is parameter and data dependent, i.e., we do not know a priori what sample size is needed to achieve a desired accuracy on a particular dataset for a particular set of parameters. In this article we propose the use of progressive sampling to determine the required sample size for association rule mining. We first show that a naive application of progressive sampling is not very efficient for association rule mining. We then present a refinement based on equivalence classes, that seems to work extremely well in practice and is able to converge to the desired sample size very quickly and very accurately. An additional novelty of our approach is the definition of a support-sensitive, interactive measure of accuracy across progressive samples.
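
To make the idea of a naive progressive sampling loop concrete, here is a minimal Python sketch. It is not the authors' algorithm: the geometric schedule (`start`, `factor`), the plain Jaccard similarity between consecutive frequent-itemset collections, the `target` threshold, and all function names are assumptions chosen for illustration. The paper's support-sensitive accuracy measure and its equivalence-class refinement are not reproduced here, and the itemset counter is a brute-force stand-in for a real miner.

```python
import random
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force count of itemsets with up to max_size items.

    Returns the itemsets (as frozensets) whose relative support in
    `transactions` is at least `min_support`.
    """
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    n = len(transactions)
    return {s for s, c in counts.items() if c / n >= min_support}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def progressive_sample_size(transactions, min_support,
                            start=100, factor=2, target=0.95, seed=0):
    """Grow the sample geometrically until two consecutive samples agree.

    Prefixes of one shuffled copy serve as nested random samples.
    Returns the smaller of the two sample sizes that first reach
    `target` similarity, or the full dataset size if none do.
    """
    rng = random.Random(seed)
    shuffled = list(transactions)
    rng.shuffle(shuffled)

    n = min(start, len(shuffled))
    previous = frequent_itemsets(shuffled[:n], min_support)
    while n < len(shuffled):
        n_next = min(n * factor, len(shuffled))
        current = frequent_itemsets(shuffled[:n_next], min_support)
        if jaccard(previous, current) >= target:
            return n
        n, previous = n_next, current
    return n
```

Every step of this loop re-mines a sample from scratch, which is the kind of cost the abstract has in mind when it calls the naive application "not very efficient".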

    Published In

    ICDM '02: Proceedings of the 2002 IEEE International Conference on Data Mining
    December 2002
    ISBN: 0769517544

    Publisher

    IEEE Computer Society

    United States

    Cited By

    • (2021) Enabling Privacy-Preserving Rule Mining in Decentralized Social Networks. Proceedings of the 16th International Conference on Availability, Reliability and Security, pp. 1-11. doi:10.1145/3465481.3465482. Online publication date: 17-Aug-2021
    • (2019) Efficient privacy-preserving recommendations based on social graphs. Proceedings of the 13th ACM Conference on Recommender Systems, pp. 78-86. doi:10.1145/3298689.3347013. Online publication date: 10-Sep-2019
    • (2017) Discovery of Frequent Itemsets through Randomized Sampling with Bernstein's Inequality. Proceedings of the 2017 International Conference on Data Mining, Communications and Information Technology, pp. 1-5. doi:10.1145/3089871.3089872. Online publication date: 25-May-2017
    • (2017) Efficient frequent itemsets mining through sampling and information granulation. Engineering Applications of Artificial Intelligence 65:C, pp. 119-136. doi:10.1016/j.engappai.2017.07.016. Online publication date: 1-Oct-2017
    • (2015) Mining Frequent Itemsets through Progressive Sampling with Rademacher Averages. Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1005-1014. doi:10.1145/2783258.2783265. Online publication date: 10-Aug-2015
    • (2014) Efficient Discovery of Association Rules and Frequent Itemsets through Sampling with Tight Performance Guarantees. ACM Transactions on Knowledge Discovery from Data 8:4, pp. 1-32. doi:10.1145/2629586. Online publication date: 29-Aug-2014
    • (2012) PARMA. Proceedings of the 21st ACM international conference on Information and knowledge management, pp. 85-94. doi:10.1145/2396761.2396776. Online publication date: 29-Oct-2012
    • (2012) Stratified k-means clustering over a deep web data source. Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 1113-1121. doi:10.1145/2339530.2339705. Online publication date: 12-Aug-2012
    • (2010) Discovery of frequent patterns in transactional data streams. Transactions on large-scale data- and knowledge-centered systems II, pp. 1-30. doi:10.5555/1986668.1986670. Online publication date: 1-Jan-2010
    • (2010) Discovery of frequent patterns in transactional data streams. Transactions on large-scale data- and knowledge-centered systems II, pp. 1-30. doi:10.5555/1980651.1980653. Online publication date: 1-Jan-2010