Mining weighted sequential patterns in incremental uncertain databases

Published: 01 January 2022

Highlights

An efficient algorithm, FUWS, to mine weighted uncertain sequences.
Two techniques, uWSInc and uWSInc+, to mine uncertain sequences in incremental databases.
Upper bounds wgt^cap, expSup^cap, and wExpSup^cap to maintain the anti-monotone property.
A hierarchical index structure, USeq-Trie, to maintain weighted uncertain sequences.
The first known methods for weighted sequences in incremental uncertain databases.
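
As a rough illustration of how a hierarchical index can hold mined sequences and absorb increments, the sketch below stores patterns in a plain prefix tree, with an accumulated support value on each node. It is not the authors' USeq-Trie, whose layout this page does not describe; the names `TrieNode`, `SequenceTrie`, `add_support`, `get_support`, and `patterns_at_least` are hypothetical, and each pattern is simplified to a tuple of single items.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple


@dataclass
class TrieNode:
    # Support accumulated for the pattern that ends at this node (assumed layout).
    support: float = 0.0
    children: Dict[str, "TrieNode"] = field(default_factory=dict)


class SequenceTrie:
    """Hypothetical prefix tree keyed by items; not the paper's USeq-Trie."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def add_support(self, pattern: Sequence[str], delta: float) -> None:
        # Walk (or create) the path for the pattern, then add the support
        # contributed by the newly arrived sequences.
        node = self.root
        for item in pattern:
            node = node.children.setdefault(item, TrieNode())
        node.support += delta

    def get_support(self, pattern: Sequence[str]) -> float:
        # Patterns that were never stored report a support of 0.0.
        node = self.root
        for item in pattern:
            if item not in node.children:
                return 0.0
            node = node.children[item]
        return node.support

    def patterns_at_least(self, threshold: float) -> List[Tuple[Tuple[str, ...], float]]:
        # Depth-first traversal collecting every stored pattern whose
        # accumulated support reaches the threshold.
        found: List[Tuple[Tuple[str, ...], float]] = []
        stack: List[Tuple[Tuple[str, ...], TrieNode]] = [((), self.root)]
        while stack:
            prefix, node = stack.pop()
            for item, child in node.children.items():
                pattern = prefix + (item,)
                if child.support >= threshold:
                    found.append((pattern, child.support))
                stack.append((pattern, child))
        return sorted(found)


# Initial mining result, followed by one increment that touches ("a", "b").
trie = SequenceTrie()
trie.add_support(("a",), 1.5)
trie.add_support(("a", "b"), 0.72)
trie.add_support(("a", "b"), 0.4)
```

Because patterns with a common prefix share a path, an increment only touches the nodes of the patterns it affects. A pattern that was never stored cannot be recovered from the trie alone, so a real incremental miner needs some extra mechanism for newly frequent patterns.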

Abstract

Due to the rapid development of science and technology, the importance of imprecise, noisy, and uncertain data is increasing at an exponential rate. Thus, mining patterns in uncertain databases has drawn the attention of researchers. Moreover, frequent sequences of items need to be discovered from these databases to obtain meaningful, high-impact knowledge. In many real cases, weights of items and patterns are introduced as a measure of importance to find interesting sequences. Hence, a weight constraint needs to be handled while mining sequential patterns. Besides, the dynamic nature of databases makes mining important information more challenging. Instead of mining patterns from scratch after each increment, incremental mining algorithms use previously mined information to update the result immediately. Several algorithms exist to mine frequent patterns and weighted sequences from incremental databases. However, these algorithms are confined to precise databases. Therefore, in this work, we develop an algorithm to mine frequent sequences in an uncertain database. Furthermore, we propose two new techniques for mining when the database is incremental. Extensive experiments have been conducted for performance evaluation, and the analysis shows the efficiency of our proposed framework.
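
To make the notions of support under uncertainty and weight concrete, here is a minimal sketch under stated assumptions, since the abstract does not give the paper's definitions. It assumes that each item in a sequence carries an independent existence probability, that the support of a pattern in one sequence is the highest product of probabilities over all of its occurrences, that expected support sums this value over the database, and that a pattern's weight is the average weight of its items. Sequences are simplified to lists of single items, and all function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# One uncertain sequence: ordered (item, existence probability) pairs.
UncertainSequence = List[Tuple[str, float]]


def max_occurrence_probability(pattern: Sequence[str], sequence: UncertainSequence) -> float:
    # best[j] holds the highest probability product found so far for
    # matching the first j pattern items as a subsequence.
    best = [1.0] + [0.0] * len(pattern)
    for item, prob in sequence:
        # Iterate j downwards so one sequence element is used at most once.
        for j in range(len(pattern), 0, -1):
            if pattern[j - 1] == item:
                best[j] = max(best[j], best[j - 1] * prob)
    return best[len(pattern)]


def expected_support(pattern: Sequence[str], database: List[UncertainSequence]) -> float:
    # Assumed definition: sum of per-sequence maximum occurrence probabilities.
    return sum(max_occurrence_probability(pattern, seq) for seq in database)


def weighted_expected_support(
    pattern: Sequence[str],
    database: List[UncertainSequence],
    weights: Dict[str, float],
) -> float:
    # Assumed definition: average item weight times expected support.
    if not pattern:
        raise ValueError("pattern must contain at least one item")
    average_weight = sum(weights[item] for item in pattern) / len(pattern)
    return average_weight * expected_support(pattern, database)


database: List[UncertainSequence] = [
    [("a", 0.9), ("b", 0.5), ("a", 0.4), ("b", 0.8)],
    [("b", 0.7), ("a", 0.6)],
]
weights = {"a": 0.8, "b": 0.4}
print(expected_support(["a", "b"], database))
print(weighted_expected_support(["a", "b"], database, weights))
```

Under these assumptions, expected support is a sum over sequences, so the contribution of newly appended sequences can be added to a stored value without rescanning the old data. Weighted expected support, by contrast, is not anti-monotone: extending a pattern with a heavier item can raise its average weight faster than its expected support falls, which is why weighted miners usually prune with an upper bound rather than with the measure itself.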

Published In

Information Sciences: an International Journal, Volume 582, Issue C
Jan 2022
897 pages

Publisher

Elsevier Science Inc., United States

Author Tags

1. Data mining
2. Sequential pattern mining
3. Weighted sequential patterns
4. Uncertain database
5. Incremental database

Qualifiers

• Research-article

Cited By

• (2023) An improved algorithm for frequent sequence pattern mining based on PrefixSpan-ComplexPrefixSpan. Proceedings of the 4th International Conference on Artificial Intelligence and Computer Engineering, pp. 48-52. DOI: 10.1145/3652628.3652636. Online publication date: 17-Nov-2023.
• (2023) Weighted Statistically Significant Pattern Mining. Companion Proceedings of the ACM Web Conference 2023, pp. 1276-1285. DOI: 10.1145/3543873.3587586. Online publication date: 30-Apr-2023.
• (2022) Q-Eclat: Vertical Mining of Interesting Quantitative Patterns. Proceedings of the 26th International Database Engineered Applications Symposium, pp. 25-33. DOI: 10.1145/3548785.3548808. Online publication date: 22-Aug-2022.
• (2022) Novel next-group recommendation approach based on sequential market basket information. Electronic Commerce Research 23:4, pp. 2399-2418. DOI: 10.1007/s10660-022-09543-x. Online publication date: 19-Mar-2022.
• (2021) A mathematical model for friend discovery from dynamic social graphs. Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 569-576. DOI: 10.1145/3487351.3489473. Online publication date: 8-Nov-2021.
• (2021) Compressing and mining social network data. Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 545-552. DOI: 10.1145/3487351.3489472. Online publication date: 8-Nov-2021.
