research-article

An efficient algorithm for mining high utility patterns from incremental databases with one database scan

Published: 15 May 2017 Publication History

Abstract

High utility pattern mining has been actively researched as a significant topic in data mining because it addresses a limitation of traditional pattern mining, which cannot fully account for the characteristics of real-world databases. Moreover, database volumes have grown steadily in applications such as sales data of retail markets and connection information of web services, and general methods designed for static databases are not suitable for processing dynamic databases and extracting useful information from them. Although incremental utility pattern mining approaches have been suggested, previous approaches need at least two database scans regardless of the data structure they use. Approaches that require multiple scans, however, are not adequate for stream environments. In this paper, we propose an efficient algorithm for mining high utility patterns from incremental databases with one database scan, based on a list-based data structure and without candidate generation. Experimental results on real and synthetic datasets show that the proposed algorithm outperforms previous one-phase construction methods with candidate generation.
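The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea behind a list-based structure that is updated with a single read of each incoming transaction. It is not the authors' method. The class name `UtilityLists`, the `unit_profit` table, and the example items and values are all hypothetical, and the sketch uses the common definition of utility as quantity times unit profit, summed over the transactions that contain every item of the itemset.

```python
from collections import defaultdict


class UtilityLists:
    """Per-item lists of (transaction id -> utility), grown incrementally.

    Illustrative only: names and layout are assumptions, not the paper's structure.
    """

    def __init__(self, unit_profit):
        self.unit_profit = unit_profit   # item -> profit per unit (hypothetical values)
        self.lists = defaultdict(dict)   # item -> {tid: utility of the item in that transaction}
        self.next_tid = 0

    def insert(self, transaction):
        """Read one transaction ({item: quantity}) exactly once and update the lists."""
        tid = self.next_tid
        self.next_tid += 1
        for item, quantity in transaction.items():
            self.lists[item][tid] = quantity * self.unit_profit[item]
        return tid

    def utility(self, itemset):
        """Sum the utilities of all items over transactions containing the whole itemset."""
        items = list(itemset)
        if not items:
            return 0
        # Intersect the transaction ids of every item's list.
        common = set(self.lists.get(items[0], {}))
        for item in items[1:]:
            common &= set(self.lists.get(item, {}))
        return sum(self.lists[item][tid] for tid in common for item in items)

    def is_high_utility(self, itemset, min_util):
        """An itemset is high utility when its utility reaches the threshold."""
        return self.utility(itemset) >= min_util


db = UtilityLists({"a": 5, "b": 2, "c": 1})
db.insert({"a": 1, "b": 2})
db.insert({"a": 2, "c": 3})
before = db.utility({"a", "b"})          # 9: only the first transaction contains both items
db.insert({"a": 1, "b": 1, "c": 4})      # newly arrived transaction, read once
after = db.utility({"a", "b"})           # 16: the lists were extended, not rebuilt
```

In this sketch, new transactions only append entries to the existing lists, so the earlier data never has to be rescanned. The sketch leaves out pruning and the search over itemsets, which a real miner would need.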



    Published In

    Knowledge-Based Systems, Volume 124, Issue C
    May 2017
    207 pages

    Publisher

    Elsevier Science Publishers B. V.

    Netherlands


    Author Tags

    1. Data mining
    2. High utility patterns
    3. Incremental mining
    4. One database scan
    5. Utility mining
