
An efficient approach for incremental erasable utility pattern mining from non-binary data

  • Regular Paper
  • Published:
Knowledge and Information Systems

Abstract

A great deal of real-life data is generated incrementally around the world, and the efficient processing of such continuously accumulating data has recently become an issue of interest. Mining and recognizing removable patterns in this kind of data is a challenging task. Erasable pattern mining addresses this challenge by discovering removable patterns with low gain. In various real-world applications, data are stored in non-binary databases, which record item information as quantities. Because the items in a database can each have different characteristics, such as quantities, taking their relative features into account makes the mined patterns more meaningful. For these reasons, we propose an erasable utility pattern mining algorithm for incremental non-binary databases. The suggested technique recognizes removable patterns by considering the relative utility of items and the profit of products in an incremental database. The proposed algorithm uses a list structure to extract erasable utility patterns efficiently. Several experiments on real and synthetic datasets compare the performance of the suggested algorithm with that of state-of-the-art techniques, and the results demonstrate the effectiveness of the proposed method.
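To make the notion of a removable pattern with low gain concrete, the following minimal sketch illustrates the classic binary formulation of erasable itemset mining, in which the gain of a pattern is the total profit of the products that use at least one of its items, and a pattern is erasable when that gain does not exceed a threshold fraction of the total profit. This is not the utility-based, incremental, list-based algorithm proposed in the article, whose definitions are not given in the abstract. The toy database `PRODUCTS`, the function names, and the threshold value are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy product database: (set of component items, product profit).
PRODUCTS = [
    ({"a", "b"}, 100),
    ({"b", "c"}, 200),
    ({"c", "d"}, 50),
    ({"a", "d"}, 150),
]


def gain(pattern, products):
    """Sum the profit of every product that uses at least one item of the pattern."""
    return sum(profit for items, profit in products if items & pattern)


def erasable_patterns(products, threshold):
    """Level-wise search for patterns whose gain is at most threshold * total profit.

    Gain never decreases when items are added to a pattern, so a pattern that
    is not erasable cannot have an erasable superset. Candidates are therefore
    built only from erasable patterns of the previous level.
    """
    total = sum(profit for _, profit in products)
    limit = threshold * total
    all_items = sorted(set().union(*(items for items, _ in products)))

    level = [frozenset([i]) for i in all_items if gain({i}, products) <= limit]
    result = {}
    while level:
        for pattern in level:
            result[pattern] = gain(pattern, products)
        # Join pairs of erasable patterns that differ in exactly one item.
        candidates = {
            a | b for a, b in combinations(level, 2) if len(a | b) == len(a) + 1
        }
        level = [c for c in candidates if gain(c, products) <= limit]
    return result


if __name__ == "__main__":
    for pattern, g in sorted(erasable_patterns(PRODUCTS, 0.5).items(),
                             key=lambda kv: sorted(kv[0])):
        print(sorted(pattern), g)
```

In this toy setting the total profit is 500, so with a threshold of 0.5 only patterns whose gain is at most 250 count as erasable. A non-binary variant would additionally have to weigh each item by its quantity or utility, which this sketch deliberately leaves out.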

Acknowledgements

This research was supported by the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (NRF No. 2021R1A2C1009388).

Author information

Contributions

Yoonji Baek and Heonho Kim contributed to writing the drafts and to the development (modification) of the algorithms; Hanju Kim, Myungha Cho, Hyeonmo Kim, and Chanhee Kee contributed to writing and revising the draft and to the validation of the proposed approach; Taewoong Ryu contributed to writing the drafts and to the validation of the proposed approach; Bay Vo, Vincent W. Gan, Philippe Fournier, Jerry Chun-Wei Lin, and Witold Pedrycz performed the critical review and contributed to the validation of the proposed approach; Unil Yun contributed to writing the drafts, the conceptualization of the ideas behind the proposed approach, funding acquisition, project administration, and the validation of the proposed approach.

Corresponding author

Correspondence to Unil Yun.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The first authors are Yoonji Baek, Hanju Kim, and Myungha Cho.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Baek, Y., Kim, H., Cho, M. et al. An efficient approach for incremental erasable utility pattern mining from non-binary data. Knowl Inf Syst 66, 5919–5958 (2024). https://doi.org/10.1007/s10115-024-02185-5

  • DOI: https://doi.org/10.1007/s10115-024-02185-5