Data mining is an essential technique in knowledge discovery, widely used for pattern extraction and information classification. Association rule mining (ARM) is an important data mining technique that extracts useful rules and knowledge from the relationships and associations among data items. Extracting frequent patterns and association rules requires several scans of the dataset, which makes the process time-consuming. Discovering frequent patterns is the major phase of ARM and is very expensive in terms of execution time. Powerful parallel systems with multiple graphics processing units (GPUs) and multiple general-purpose graphics processing units (GPGPUs) are appropriate choices to reduce the execution time. Although GPU architectures can speed up the mining process, a single GPU is usually unable to process a large amount of data when extracting frequent patterns. It is therefore necessary to use multiple GPUs in one system, or to distribute them across a network, to improve the efficiency of parallelization. This paper proposes a new framework, called GPApbmp, that parallelizes the Apriori algorithm, a well-known level-wise frequent pattern mining method, across multiple GPUs for faster extraction of association rules. The proposed framework distributes the dataset across multiple GPUs and uses a vertical approach to reduce the execution time and the number of database scans in the Apriori method. Experimental results on standard datasets show that the proposed method reduces the execution time and speeds up the mining process. The results were obtained with two and four parallelized NVidia GeForce 710 processors, using CUDA.
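The general idea of a level-wise Apriori search over a vertical layout, with the dataset split across several workers, can be illustrated with a short CPU-only sketch. This is not the GPApbmp implementation, which the abstract does not describe in detail: the function names (`to_vertical`, `partition`, `local_support`, `generate_candidates`, `apriori_vertical`), the contiguous partitioning scheme, and the use of Python sets as transaction-id lists are all assumptions made for illustration, and each partition merely stands in for the share of data one GPU might hold.

```python
from itertools import combinations


def to_vertical(transactions, offset=0):
    """Map each item to the set of transaction ids (tidset) that contain it."""
    vertical = {}
    for tid, items in enumerate(transactions, start=offset):
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical


def partition(transactions, parts):
    """Split transactions into contiguous chunks, one per hypothetical device."""
    size = max(1, -(-len(transactions) // parts))  # ceiling division
    return [(i, transactions[i:i + size])
            for i in range(0, len(transactions), size)]


def local_support(vertical, itemset):
    """Support inside one partition: size of the tidset intersection."""
    tidsets = [vertical.get(item, set()) for item in itemset]
    return len(set.intersection(*tidsets))


def generate_candidates(prev):
    """Join k-itemsets sharing a (k-1)-prefix; prune by the Apriori property."""
    prev_set = set(prev)
    out = []
    for a, b in combinations(prev, 2):
        if a[:-1] == b[:-1]:
            cand = a + (b[-1],)
            # Every k-subset of a (k+1)-candidate must itself be frequent.
            if all(sub in prev_set for sub in combinations(cand, len(a))):
                out.append(cand)
    return out


def apriori_vertical(transactions, min_support, parts=2):
    """Return {itemset tuple: support} for all frequent itemsets."""
    # The raw data is read once to build one vertical layout per partition.
    verticals = [to_vertical(chunk, offset)
                 for offset, chunk in partition(transactions, parts)]
    items = sorted({item for t in transactions for item in t})
    candidates = [(item,) for item in items]
    frequent = {}
    while candidates:
        level = {}
        for cand in candidates:
            # Partitions count independently; global support is the sum.
            support = sum(local_support(v, cand) for v in verticals)
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        candidates = generate_candidates(sorted(level))
    return frequent
```

Calling `apriori_vertical(transactions, min_support, parts=4)` on a list of item sets returns the frequent itemsets with their absolute supports. Because the partitions are disjoint, the summed local counts equal the global support regardless of the number of partitions, which is the property that lets the counting step be spread over several devices in this sketch.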
Ethics declarations
There is no funding for this paper.
Conflict of interest
The authors declare that they have no conflicts of interest.
Competing interests
The authors declare that they have no competing interests.
About this article
Cite this article
Zoraghchian, A.A., Sohrabi, M.K. & Yaghmaee, F. Parallel frequent itemsets mining using distributed graphic processing units. Multimed Tools Appl 81, 43873–43895 (2022). https://doi.org/10.1007/s11042-022-13225-z
DOI: https://doi.org/10.1007/s11042-022-13225-z