
Parallel Frequent Item Set Mining with Selective Item Replication

Published: 01 October 2011

Abstract

We introduce a transaction database distribution scheme that divides the frequent item set mining task in a top-down fashion. Our method operates on a graph where vertices correspond to frequent items and edges correspond to frequent item sets of size two. We show that partitioning this graph by a vertex separator is sufficient to decide a distribution of the items such that the subdatabases determined by the item distribution can be mined independently. This distribution entails some data replication, which may be reduced by assigning appropriate weights to vertices. The data distribution scheme is used in the design of two new parallel frequent item set mining algorithms. Both algorithms replicate the items that correspond to the separator. NoClique replicates the work induced by the separator, whereas NoClique2 computes the same work collectively. Computational load balancing and minimization of redundant or collective work may be achieved by assigning appropriate load estimates to vertices. The experiments show favorable speedups on a system with a small-to-medium number of processors for synthetic and real-world databases.
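
The core idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' NoClique or NoClique2 implementation: the toy transactions, the function names, the brute-force miner, and the hand-picked separator `{"s"}` are all assumptions made for illustration (the paper obtains the separator by graph partitioning, which is not reproduced here). The sketch builds the graph of frequent items and frequent pairs, checks that no edge joins the two parts, projects the database onto each part plus the separator, and mines the two subdatabases independently.

```python
from collections import Counter
from itertools import combinations


def frequent_items_and_edges(transactions, min_support):
    """Vertices are frequent items; edges are frequent item sets of size two."""
    item_counts = Counter(i for t in transactions for i in set(t))
    items = {i for i, c in item_counts.items() if c >= min_support}
    pair_counts = Counter()
    for t in transactions:
        kept = sorted(set(t) & items)
        pair_counts.update(combinations(kept, 2))
    edges = {p for p, c in pair_counts.items() if c >= min_support}
    return items, edges


def is_vertex_separator(part_a, part_b, edges):
    """True if no edge connects part_a directly to part_b."""
    return not any(
        (u in part_a and v in part_b) or (u in part_b and v in part_a)
        for u, v in edges
    )


def project(transactions, keep):
    """Subdatabase: every transaction restricted to the items in `keep`."""
    return [set(t) & keep for t in transactions]


def mine_brute_force(transactions, min_support):
    """Exhaustive miner for tiny inputs only (stand-in for a real miner)."""
    counts = Counter()
    for t in transactions:
        items = sorted(t)
        for k in range(1, len(items) + 1):
            counts.update(combinations(items, k))
    return {frozenset(s) for s, c in counts.items() if c >= min_support}


# Hypothetical toy database; "s" plays the role of the separator item.
transactions = [
    {"a", "b", "s"}, {"a", "b", "s"}, {"a", "s"},
    {"c", "d", "s"}, {"c", "d", "s"}, {"c", "d"},
    {"a", "c"},  # a and c co-occur only once, so (a, c) is not an edge
]
min_support = 2

items, edges = frequent_items_and_edges(transactions, min_support)
part_a, part_b, separator = {"a", "b"}, {"c", "d"}, {"s"}
separates = is_vertex_separator(part_a, part_b, edges)

# Separator items are replicated into both subdatabases.
local_a = mine_brute_force(project(transactions, part_a | separator), min_support)
local_b = mine_brute_force(project(transactions, part_b | separator), min_support)
global_result = mine_brute_force(transactions, min_support)
```

In this toy case the union of the two local results equals the result of mining the whole database. The reason is that every frequent item set forms a clique in the graph, and a clique cannot contain items from both parts when no edge joins them. The item set `{"s"}` is found by both subdatabases, which is a small instance of the redundant separator work that the abstract says NoClique replicates and NoClique2 computes collectively.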

Published In

IEEE Transactions on Parallel and Distributed Systems, Volume 22, Issue 10
October 2011
176 pages

Publisher

IEEE Press

Author Tags

1. Parallel data mining
2. frequent item set mining
3. graph partitioning by vertex separator
4. mining methods and algorithms
5. selective data replication

Cited By

• (2021) A Parallelized Frequent Temporal Pattern Mining Algorithm on a Time Series Database. Intelligent Information and Database Systems, pp. 78-91. DOI: 10.1007/978-3-030-73280-6_7. Online publication date: 7-Apr-2021
• (2017) Frequent Itemset Mining in Large Datasets a Survey. International Journal of Information Retrieval Research 7:4, pp. 37-49. DOI: 10.4018/IJIRR.2017100103. Online publication date: 1-Oct-2017
• (2016) Compressed Bitmaps Based Frequent Itemsets Mining on Hadoop. Proceedings of the 10th International Conference on Informatics and Systems, pp. 159-165. DOI: 10.1145/2908446.2908457. Online publication date: 9-May-2016
• (2016) A sparse memory allocation data structure for sequential and parallel association rule mining. The Journal of Supercomputing 72:2, pp. 347-370. DOI: 10.1007/s11227-015-1566-x. Online publication date: 1-Feb-2016
• (2015) A load balancing parallel method for frequent pattern mining on multi-core cluster. Proceedings of the Symposium on High Performance Computing, pp. 49-58. DOI: 10.5555/2872599.2872606. Online publication date: 12-Apr-2015
• (2015) A distributed frequent itemset mining algorithm using Spark for Big Data analytics. Cluster Computing 18:4, pp. 1493-1501. DOI: 10.1007/s10586-015-0477-1. Online publication date: 1-Dec-2015
• (2014) Parallel Pre-processing for XML mining using Graphic Processor. Proceedings of the 2014 International Conference on Interdisciplinary Advances in Applied Computing, pp. 1-7. DOI: 10.1145/2660859.2660950. Online publication date: 10-Oct-2014
• (2013) Efficient mining of frequent itemsets in social network data based on MapReduce framework. Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 1183-1188. DOI: 10.1145/2492517.2500301. Online publication date: 25-Aug-2013
• (2013) Parallel and Distributed Mining of Probabilistic Frequent Itemsets Using Multiple GPUs. Proceedings of the 24th International Conference on Database and Expert Systems Applications - Volume 8055, pp. 145-152. DOI: 10.1007/978-3-642-40285-2_14. Online publication date: 26-Aug-2013
• (2012) GPU acceleration of probabilistic frequent itemset mining from uncertain databases. Proceedings of the 21st ACM international conference on Information and knowledge management, pp. 892-901. DOI: 10.1145/2396761.2396874. Online publication date: 29-Oct-2012