Article
DOI: 10.1145/1281192.1281238

A fast algorithm for finding frequent episodes in event streams

Published: 12 August 2007

Abstract

Frequent episode discovery is a popular framework for mining data available as a long sequence of events. An episode is essentially a short ordered sequence of event types, and the frequency of an episode is some suitable measure of how often the episode occurs in the data sequence. Recently, we proposed a new frequency measure for episodes based on the notion of non-overlapped occurrences of episodes in the event sequence, and showed that such a definition, in addition to yielding computationally efficient algorithms, has some important theoretical properties in connecting frequent episode discovery with HMM learning. This paper presents some new algorithms for frequent episode discovery under this non-overlapped occurrences-based frequency definition. The algorithms presented here are better (by a factor of N, where N denotes the size of episodes being discovered) in terms of both time and space complexities when compared to existing methods for frequent episode discovery. We show through some simulation experiments that our algorithms are very efficient. The new algorithms presented here have arguably the least possible orders of space and time complexities for the task of frequent episode discovery.
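
To make the non-overlapped frequency notion concrete, the following is a minimal Python sketch that counts non-overlapped occurrences of a single serial episode in an event sequence. It is an illustration only and is not the algorithm presented in the paper: the function name `count_non_overlapped`, the representation of events as plain event-type symbols without timestamps, and the single-automaton greedy scan are all assumptions made for this example. The paper's algorithms handle many candidate episodes at once, which this sketch does not attempt.

```python
from typing import Hashable, Iterable, Sequence


def count_non_overlapped(events: Iterable[Hashable],
                         episode: Sequence[Hashable]) -> int:
    """Count non-overlapped occurrences of one serial episode.

    Assumptions (not from the paper): events carry no timestamps, and a
    new occurrence may only start after the previous one has completed.
    """
    if not episode:
        raise ValueError("episode must contain at least one event type")

    count = 0  # completed occurrences so far
    pos = 0    # index of the next event type the automaton waits for

    for event_type in events:
        if event_type == episode[pos]:
            pos += 1
            if pos == len(episode):
                # The whole episode has been seen: record it and reset,
                # so the next occurrence cannot overlap this one.
                count += 1
                pos = 0
    return count


# Hypothetical event stream, used only for illustration.
stream = list("ABACBCAXBC")
print(count_non_overlapped(stream, ["A", "B", "C"]))  # prints 2
```

In this sketch each event advances the automaton by at most one state, so a single pass over the sequence uses constant extra space per episode. Matching each event type at its earliest position makes every occurrence finish as early as possible, which is why the greedy scan does not undercount for a serial episode under these assumptions.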


    Published In

    KDD '07: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
    August 2007
    1080 pages
    ISBN:9781595936097
    DOI:10.1145/1281192

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. event streams
    2. frequent episodes
    3. non-overlapped occurrences
    4. temporal data mining

    Qualifiers

    • Article

    Conference

    KDD07

    Acceptance Rates

    KDD '07 Paper Acceptance Rate 111 of 573 submissions, 19%;
    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
