
PSP-AMS: Progressive Mining of Sequential Patterns Across Multiple Streams

Published: 19 December 2018

Abstract

Sequential pattern mining is used to find frequent data sequences over time. When sequential patterns are generated, the newly arriving patterns may not be identified as frequent sequential patterns due to the existence of old data and sequences. Progressive sequential pattern mining aims to find the most up-to-date sequential patterns given that obsolete items will be deleted from the sequences. When sequences come with multiple data streams, it is difficult to maintain and update the current sequential patterns. Even worse, when we consider the sequences across multiple streams, previous methods cannot efficiently compute the frequent sequential patterns. In this work, we propose an efficient algorithm, PSP-AMS, to address this problem. PSP-AMS uses a novel data structure, the PSP-MS-tree, to insert new items, update current items, and delete obsolete items. By maintaining a PSP-MS-tree, PSP-AMS efficiently finds the frequent sequential patterns across multiple streams. The experimental results show that PSP-AMS significantly outperforms previous algorithms for mining progressive sequential patterns across multiple streams on both synthetic and real data.
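The problem setting described in the abstract can be illustrated with a small brute-force sketch. This is not PSP-AMS and does not model the PSP-MS-tree; it only shows how deleting obsolete items changes which across-stream patterns count as frequent. The function name `frequent_pairs`, the `(seq_id, stream_id, timestamp, item)` event layout, the `window` parameter, and the restriction to ordered pairs are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def frequent_pairs(events, now, window, min_support):
    """Brute-force sketch: frequent ordered pairs among non-obsolete items.

    events: iterable of (seq_id, stream_id, timestamp, item) tuples.
    An item is treated as obsolete when timestamp <= now - window
    (an assumed rule, chosen only for this illustration).
    """
    # Keep only items still inside the window, grouped per sequence.
    recent = defaultdict(list)
    for seq_id, stream_id, ts, item in events:
        if now - window < ts <= now:
            recent[seq_id].append((ts, (stream_id, item)))

    # Count each ordered pair at most once per sequence.
    support = defaultdict(int)
    for elems in recent.values():
        elems.sort()
        pairs = set()
        for (t1, a), (t2, b) in combinations(elems, 2):
            if t1 < t2:  # b must occur strictly after a
                pairs.add((a, b))
        for pair in pairs:
            support[pair] += 1

    return {p: c for p, c in support.items() if c >= min_support}


# Hypothetical data: two sequences (u1, u2) observed over two streams (s1, s2).
events = [
    ("u1", "s1", 1, "A"), ("u1", "s2", 2, "B"),
    ("u2", "s1", 1, "A"), ("u2", "s2", 3, "B"),
    ("u1", "s1", 5, "C"), ("u2", "s2", 6, "B"),
]

print(frequent_pairs(events, now=3, window=3, min_support=2))
print(frequent_pairs(events, now=6, window=3, min_support=2))
```

At `now=3` the across-stream pair (A on s1, then B on s2) is supported by both sequences. At `now=6` the items behind that pair have become obsolete, so it is no longer reported. Recomputing everything from scratch at each time step, as this sketch does, is the kind of cost an incrementally maintained structure is meant to avoid.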

    Published In

    ACM Transactions on Knowledge Discovery from Data  Volume 13, Issue 1
    February 2019
    340 pages
    ISSN:1556-4681
    EISSN:1556-472X
    DOI:10.1145/3301280
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 19 December 2018
    Accepted: 01 September 2018
    Revised: 01 July 2018
    Received: 01 February 2018
    Published in TKDD Volume 13, Issue 1

    Author Tags

    1. Progressive mining
    2. across data streams
    3. across-streams sequential patterns
    4. multiple data streams
    5. sequential patterns

    Qualifiers

    • Research-article
    • Research
    • Refereed
