research-article

Efficient approach of high average utility pattern mining with indexed list-based structure in dynamic environments

Published: 14 March 2024

    Abstract

    Various studies on high utility pattern mining have been conducted to satisfy the emerging need to consider the characteristics of real-world databases, such as the importance and quantity of items. In the traditional utility-based framework, the mining result is influenced by the number of items in a pattern, or in some cases, single utilities of items. In order to overcome this drawback, high average utility pattern mining has been proposed. It provides more interesting results since it takes into account the average utility of patterns by considering their lengths. Methods based on this concept have emerged in recent years, including ones that target incremental environments. However, existing algorithms create an enormous number of candidate patterns or require complex operations during the mining process. To address this degradation, we propose a new and more efficient approach for mining high average utility patterns in dynamic environments. The proposed algorithm utilizes a data structure more efficient than previous ones, which takes the form of an indexed list. It also incorporates efficient realigning and mining techniques for handling incremental data and accurately mining results. Experimental results show the superiority of the proposed approach in terms of runtime, memory usage, scalability, and accuracy.
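To make the length-normalized measure described above concrete, here is a minimal brute-force sketch in Python. It assumes the definition of average utility that is standard in this literature: the summed utility of a pattern over the transactions that contain it, divided by the pattern's length. The toy transactions, the unit profits, the absolute threshold `min_au`, and all function names are hypothetical illustrations. This is not the indexed-list algorithm proposed in the article, whose details are not given in the abstract.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 2, "b": 1},
    {"a": 1, "c": 3},
    {"a": 1, "b": 2, "c": 1},
]
# Hypothetical unit profit (importance) of each item.
PROFIT = {"a": 5, "b": 2, "c": 1}


def utility(pattern, transactions, profit):
    """Sum quantity * profit of the pattern's items over every
    transaction that contains the whole pattern."""
    total = 0
    for t in transactions:
        if all(item in t for item in pattern):
            total += sum(t[item] * profit[item] for item in pattern)
    return total


def average_utility(pattern, transactions, profit):
    """Utility divided by pattern length, so longer patterns are not
    favored merely for containing more items."""
    return utility(pattern, transactions, profit) / len(pattern)


def mine_brute_force(transactions, profit, min_au):
    """Enumerate every itemset and keep those whose average utility
    reaches min_au (assumed here to be an absolute threshold).
    Exponential in the number of items: for illustration only."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for k in range(1, len(items) + 1):
        for pattern in combinations(items, k):
            au = average_utility(pattern, transactions, profit)
            if au >= min_au:
                result[pattern] = au
    return result


if __name__ == "__main__":
    print(utility(("a", "b"), TRANSACTIONS, PROFIT))          # 21
    print(average_utility(("a", "b"), TRANSACTIONS, PROFIT))  # 10.5
    print(mine_brute_force(TRANSACTIONS, PROFIT, min_au=10))
```

In this toy data the pattern {a, b} has a higher total utility (21) than {a} alone (20), yet its average utility (10.5) is lower than that of {a} (20.0), which shows how dividing by length removes the advantage that longer patterns gain simply from containing more items. In an incremental setting, a naive baseline would append new transactions and rerun the enumeration from scratch, which is the kind of repeated cost that incremental methods aim to avoid.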

    Published In

    Information Sciences: an International Journal, Volume 657, Issue C
    Feb 2024
    1195 pages

    Publisher

    Elsevier Science Inc., United States

    Author Tags

    1. Data mining
    2. High average utility patterns
    3. Dynamic data mining
    4. Indexed list based structure
    5. Pattern pruning
