US-Rule: Discovering Utility-driven Sequential Rules

Published: 20 February 2023

Abstract

Utility-driven mining is an important task in data science with many real-world applications. High-utility sequential pattern mining (HUSPM) is one kind of utility-driven mining; it aims to discover all sequential patterns with high utility. However, existing HUSPM algorithms cannot provide a reasonably accurate probability for scenarios that involve prediction or recommendation. High-utility sequential rule mining (HUSRM) was proposed to discover all sequential rules with both high utility and high confidence. Only one algorithm has been proposed for HUSRM, and it is not efficient enough. In this article, we propose a faster algorithm, US-Rule, to mine high-utility sequential rules efficiently. It uses the rule estimated utility co-occurrence pruning strategy (REUCP) to avoid meaningless computations. Moreover, to improve efficiency on dense and long-sequence datasets, four tighter upper bounds (LEEU, REEU, LERSU, and RERSU) and their corresponding pruning strategies (LEEUP, REEUP, LERSUP, and RERSUP) are designed. US-Rule also introduces the rule estimated utility recomputing pruning strategy (REURP) to deal with sparse datasets. Finally, extensive experiments on different datasets show that US-Rule outperforms the state-of-the-art algorithm in execution time, memory consumption, and scalability.
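To make "high utility and high confidence" concrete, the sketch below evaluates a single candidate rule X → Y by brute force on a toy database. It is not US-Rule and uses none of its pruning strategies. The definitions are assumptions based on one common formulation of HUSRM, not taken from the abstract: each item occurs at most once per sequence; a rule holds in a sequence when every item of X occurs before every item of Y; its utility is the summed utility of the items of X and Y over the sequences where it holds; and its confidence is the number of such sequences divided by the number of sequences containing X. All names, data, and thresholds are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A sequence is an ordered list of itemsets; each itemset maps an item to its
# utility in that sequence (toy numbers, purely illustrative).
Sequence = List[Dict[str, int]]


def positions(seq: Sequence) -> Dict[str, int]:
    """Map each item to the index of the itemset containing it.

    Assumes every item occurs at most once per sequence.
    """
    return {item: i for i, itemset in enumerate(seq) for item in itemset}


def rule_holds(seq: Sequence, x: Set[str], y: Set[str]) -> bool:
    """True if all items of x occur strictly before all items of y."""
    pos = positions(seq)
    if not all(item in pos for item in x | y):
        return False
    return max(pos[a] for a in x) < min(pos[b] for b in y)


def rule_utility(seq: Sequence, x: Set[str], y: Set[str]) -> int:
    """Sum the utilities of the rule's items within one sequence."""
    items = x | y
    return sum(u for itemset in seq for item, u in itemset.items() if item in items)


def evaluate(db: List[Sequence], x: Set[str], y: Set[str]) -> Tuple[int, float]:
    """Return (utility, confidence) of the rule x -> y over the database."""
    holds = [s for s in db if rule_holds(s, x, y)]
    has_x = [s for s in db if all(item in positions(s) for item in x)]
    utility = sum(rule_utility(s, x, y) for s in holds)
    confidence = len(holds) / len(has_x) if has_x else 0.0
    return utility, confidence


def is_high_utility_rule(db: List[Sequence], x: Set[str], y: Set[str],
                         min_util: int, min_conf: float) -> bool:
    """Check both user-defined thresholds (hypothetical parameter names)."""
    utility, confidence = evaluate(db, x, y)
    return utility >= min_util and confidence >= min_conf


toy_db: List[Sequence] = [
    [{"a": 5}, {"b": 3, "c": 2}, {"d": 4}],  # a before d: rule holds
    [{"a": 2}, {"d": 6}],                    # a before d: rule holds
    [{"d": 1}, {"a": 3}],                    # d before a: rule does not hold
    [{"b": 2}, {"c": 1}],                    # neither a nor d present
]

utility, confidence = evaluate(toy_db, {"a"}, {"d"})
```

On this toy data the rule {a} → {d} holds in two of the three sequences that contain a. A full miner would have to examine a very large number of candidate rules in this way, which is why upper bounds and pruning strategies such as those named in the abstract matter.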

      Published In

      ACM Transactions on Knowledge Discovery from Data, Volume 17, Issue 1
      January 2023
      375 pages
      ISSN:1556-4681
      EISSN:1556-472X
      DOI:10.1145/3572846
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 20 February 2023
      Online AM: 09 August 2022
      Accepted: 16 April 2022
      Received: 19 November 2021
      Published in TKDD Volume 17, Issue 1

      Author Tags

      1. Data mining
      2. pattern mining
      3. sequential rule
      4. utility mining

      Qualifiers

      • Research-article

      Funding Sources

      • National Natural Science Foundation of China
      • Natural Science Foundation of Guangdong Province of China
      • Guangzhou Basic and Applied Basic Research Foundation

      Cited By

      • (2024) Co-occurrence Order-preserving Pattern Mining with Keypoint Alignment for Time Series. ACM Transactions on Management Information Systems 15, 2 (1-27). DOI: 10.1145/3658450. Online publication date: 12-Jun-2024
      • (2024) Totally-ordered Sequential Rules for Utility Maximization. ACM Transactions on Knowledge Discovery from Data 18, 4 (1-23). DOI: 10.1145/3628450. Online publication date: 12-Feb-2024
      • (2023) Targeted Querying of Closed High-Utility Itemsets. 2023 IEEE International Conference on Big Data (BigData) (5967-5976). DOI: 10.1109/BigData59044.2023.10386927. Online publication date: 15-Dec-2023
      • (2023) USER: Towards High-Utility Sequential Rules with Repetitive Items. 2023 IEEE International Conference on Big Data (BigData) (5977-5986). DOI: 10.1109/BigData59044.2023.10386473. Online publication date: 15-Dec-2023