
High utility itemsets mining from transactional databases: a survey

Applied Intelligence

Abstract

Mining high utility itemsets is a basic task in the area of frequent itemset mining (FIM), with applications in diverse domains including market basket analysis, web mining, cross-marketing, and e-commerce. In recent years, many efficient high utility itemsets mining (HUIM) algorithms have been proposed to discover high utility itemsets (HUIs). This survey presents a comprehensive summary of the current state-of-the-art HUIM approaches for transactional databases. It categorises these approaches as level-wise, tree-based, utility-list-based, projection-based, and miscellaneous, and details the pros and cons of each category. A taxonomy of HUIM for transactional databases is presented. The survey also summarises and discusses approaches for other types of databases, including on-shelf, dynamic, and uncertain databases. The paper explores the applications of HUIM in diverse domains and discusses the challenges and limitations of the approach. It presents an overview of 16 real-world datasets that are used by various state-of-the-art HUIM approaches for transactional databases. Overall, this survey provides a valuable resource for researchers in the field of HUIM and offers insights into future directions for research and development in this area.


Data availability and access

Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.

Notes

  1. A k-itemset is an HUI only if all of its sub-itemsets are HUIs.

  2. If a set fails a test, all of its supersets also fail the same test.

  3. Closed high utility itemsets (CHUIs) are high utility itemsets that have no proper supersets with the same support count in the database.

  4. All the prefix extensions of an itemset with relevant items are high utility itemsets, without creating the rest of its subtree.

  5. An itemset is a high utility itemset and none of its proper subsets are, without creating the rest of its subtrees.

  6. The coverage is used to prune low utility itemsets and to quickly calculate the closure of itemsets.

  7. A lattice is an effective approach for data analysis and knowledge discovery when mining association rules.

  8. The temporal HUIs are the itemsets whose support is larger than a pre-specified threshold in the current time window of the data stream.
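The definitions in Notes 1 to 3 can be made concrete with a small worked example. The following is a minimal sketch, not any algorithm from the survey: it enumerates every itemset by brute force, and the toy transactions, unit profits, and function names are all hypothetical. It assumes the usual definition of utility as purchased quantity times unit profit, summed over the transactions that contain the whole itemset.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
]
# Hypothetical unit profit of each item.
PROFIT = {"a": 5, "b": 1, "c": 2}


def utility(itemset):
    """Sum quantity * unit profit over transactions containing every item."""
    total = 0
    for t in TRANSACTIONS:
        if all(item in t for item in itemset):
            total += sum(t[item] * PROFIT[item] for item in itemset)
    return total


def support(itemset):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in TRANSACTIONS if all(item in t for item in itemset))


def high_utility_itemsets(min_util):
    """Brute-force search: keep every itemset whose utility reaches min_util."""
    items = sorted(PROFIT)
    found = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            u = utility(combo)
            if u >= min_util:
                found[frozenset(combo)] = u
    return found


def is_closed(itemset):
    """True if no proper superset has the same support (cf. Note 3).

    Checking one-item extensions is enough, because adding items can
    never increase the support of an itemset.
    """
    base = support(itemset)
    others = [i for i in PROFIT if i not in itemset]
    return all(support(set(itemset) | {i}) < base for i in others)


def closed_high_utility_itemsets(min_util):
    """High utility itemsets that are also closed."""
    return {s: u for s, u in high_utility_itemsets(min_util).items()
            if is_closed(s)}
```

On this toy data the utility of {a, c} is 23, while {a} has utility 20 and {c} has utility 8. With a threshold of 21, {a, c} is therefore an HUI although neither of its subsets is, so the property in Note 1 cannot be assumed for raw utility values. With a threshold of 8, {c} is an HUI but is not closed, because {a, c} has the same support count of 2.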

Author information

Contributions

K.S. conceived the analysis idea and presented the taxonomy. R.K. developed the theory and the analysis of the state-of-the-art approaches. R.K. provided the data for Tables 7 to 17. K.S. implemented the example and the tables related to the running example. R.K. performed the analysis in Section 3. K.S. collected all the required data. K.S. reviewed and edited the manuscript. Both authors analyzed and contributed to the final manuscript.

Corresponding author

Correspondence to Kuldeep Singh.

Ethics declarations

Competing Interests

The authors declare no conflicts of interest. The article presents a survey of high utility itemsets mining approaches for transactional databases. No funding was received for this work.

Ethical and informed consent for data used

This article does not contain any studies with human participants or animals performed by any of the authors.

About this article

Cite this article

Kumar, R., Singh, K. High utility itemsets mining from transactional databases: a survey. Appl Intell 53, 27655–27703 (2023). https://doi.org/10.1007/s10489-023-04853-5

  • DOI: https://doi.org/10.1007/s10489-023-04853-5
