research-article

Correlated utility-based pattern mining

Published: 01 December 2019

    Abstract

    Utility-oriented mining has recently attracted great attention as a research field. However, previous studies rarely consider the inherent correlation among the items in a pattern. For example, in consumer purchase behavior, a high-utility group of products (i.e., a pattern of multiple products) may consist of a few very high-utility products together with some low-utility products. Such a group is still treated as a valuable pattern even if its items are not highly correlated, or even if they co-occur by chance. To address this limitation, we propose an efficient utility-mining approach, called non-redundant Correlated high-Utility Pattern Miner (CoUPM), which considers both positive correlation and profitable value. The derived patterns, which have high utility and strong positive correlation, can provide more insight than patterns that only have high profitable values. The utility-list structure is revised and applied to store the necessary information on both correlation and utility. Several pruning strategies are further developed to improve the efficiency of discovering the desired patterns. Experimental results showed that the non-redundant correlated high-utility patterns are more effective than some other kinds of interesting patterns. Moreover, the proposed CoUPM algorithm significantly outperformed the state-of-the-art algorithm in efficiency.
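
The purchase-behavior example in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy transactions, the unit profits, the thresholds `min_util` and `min_cor`, and the choice of the Kulczynski (Kulc) measure as the correlation measure, which the abstract does not specify. The code enumerates itemsets by brute force; it does not reproduce the utility-list structure or the pruning strategies of CoUPM.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 5, "b": 1},
    {"b": 1, "c": 2},
    {"b": 2, "c": 1},
    {"b": 1, "c": 1},
    {"a": 1},
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 20, "b": 1, "c": 2}


def covers(transaction, itemset):
    """True if the transaction contains every item of the itemset."""
    return all(item in transaction for item in itemset)


def support(itemset):
    """Number of transactions that contain the whole itemset."""
    return sum(1 for t in transactions if covers(t, itemset))


def utility(itemset):
    """Total profit of the itemset over the transactions that contain it."""
    return sum(
        t[item] * unit_profit[item]
        for t in transactions
        if covers(t, itemset)
        for item in itemset
    )


def kulc(itemset):
    """Kulczynski measure: mean of sup(X) / sup({i}) over the items i in X."""
    sup_x = support(itemset)
    if sup_x == 0:
        return 0.0
    return sum(sup_x / support((item,)) for item in itemset) / len(itemset)


def mine(min_util, min_cor):
    """Brute-force search for itemsets (size >= 2) passing both thresholds."""
    items = sorted(unit_profit)
    found = []
    for size in range(2, len(items) + 1):
        for itemset in combinations(items, size):
            u, k = utility(itemset), kulc(itemset)
            if u >= min_util and k >= min_cor:
                found.append((itemset, u, k))
    return found


if __name__ == "__main__":
    # {a, b} has high utility (101) but weak correlation (0.375).
    print(utility(("a", "b")), kulc(("a", "b")))
    # Only {b, c} is both high-utility and positively correlated.
    print(mine(min_util=10, min_cor=0.5))
```

In this toy data, the itemset {a, b} owes almost all of its utility to a single bulk purchase of the expensive item `a`, so a utility-only miner would report it, while the additional correlation threshold filters it out and keeps {b, c}.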


        Published In

        Information Sciences: an International Journal, Volume 504, Issue C
        Dec 2019
        602 pages

        Publisher

        Elsevier Science Inc.

        United States

        Author Tags

        1. Economic
        2. Utility mining
        3. Positive correlation
        4. Pruning strategy
