DOI: 10.1145/1871437.1871536
research-article

Partial drift detection using a rule induction framework

Published: 26 October 2010

Abstract

The major challenge in mining data streams is concept drift, the tendency of the underlying data-generating process to change over time. In this paper, we propose a general rule learning framework that can efficiently handle concept-drifting data streams and maintain a highly accurate classification model. The main idea is to focus on partial drifts by allowing individual rules to monitor the stream and detect whether there is a drift in the regions they cover. A rule quality measure then decides whether the affected rules are inconsistent with the new concept. The model is updated accordingly to include only rules that are consistent with the newly arrived concept. A dynamically maintained set of instances deemed relevant to the most recent concept is also kept in memory. Learning a new concept from a larger set of instances reduces the variance of the data distribution and allows for a more accurate, stable classification model. Our experiments show that this approach not only handles drift efficiently but can also provide higher classification accuracy than other competitive approaches on a variety of real and synthetic data sets.
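
The idea of per-rule monitoring described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the drift test or the rule quality measure, so this sketch assumes a simple stand-in: each rule keeps a sliding window of recent outcomes on the instances it covers, and a rule is dropped when its windowed accuracy falls below a threshold. The names `MonitoredRule`, `prune_drifted`, `WINDOW`, `min_quality`, and `min_evidence` are hypothetical and are not taken from the paper.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List

Instance = Dict[str, float]
WINDOW = 20  # assumed sliding-window size, not a value from the paper


@dataclass
class MonitoredRule:
    """A rule that tracks its own recent accuracy (hypothetical structure)."""
    name: str
    covers: Callable[[Instance], bool]  # region of the input space the rule covers
    label: int                          # class predicted inside that region
    outcomes: Deque[bool] = field(default_factory=lambda: deque(maxlen=WINDOW))

    def observe(self, x: Instance, y: int) -> None:
        # Only instances inside the rule's region count as evidence for it.
        if self.covers(x):
            self.outcomes.append(y == self.label)

    def quality(self) -> float:
        # Stand-in quality measure: accuracy over the recent window.
        if not self.outcomes:
            return 1.0
        return sum(self.outcomes) / len(self.outcomes)


def prune_drifted(rules: List[MonitoredRule],
                  min_quality: float = 0.6,
                  min_evidence: int = 10) -> List[MonitoredRule]:
    """Keep rules that lack evidence or are still consistent with recent data."""
    return [r for r in rules
            if len(r.outcomes) < min_evidence or r.quality() >= min_quality]


rules = [
    MonitoredRule("low", lambda x: x["x"] < 0.5, label=0),
    MonitoredRule("high", lambda x: x["x"] >= 0.5, label=1),
]

# Partial drift: after 30 instances, only the region x >= 0.5 changes its label.
for i in range(70):
    x = {"x": 0.2 if i % 2 == 0 else 0.8}
    drifted = i >= 30
    y = 0 if (x["x"] < 0.5 or drifted) else 1
    for rule in rules:
        rule.observe(x, y)

kept = prune_drifted(rules)
print([r.name for r in kept])
```

In this toy stream only the rule covering the drifted region is discarded, while the rule over the unaffected region survives. A fuller implementation would also learn replacement rules from the retained instances, which this sketch leaves out.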

Cited By

  • (2017) Mining Evolving Data Streams with Particle Filters. Computational Intelligence 33:2 (147-180). DOI: 10.1111/coin.12071. Online publication date: 1-May-2017
  • (2014) Self Tuning IDS for Changing Environment. Proceedings of the 2014 International Conference on Computational Intelligence and Communication Networks (1083-1087). DOI: 10.1109/CICN.2014.227. Online publication date: 14-Nov-2014
  • (2013) Intrusion detection system using stream data mining and drift detection method. 2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT) (1-5). DOI: 10.1109/ICCCNT.2013.6726628. Online publication date: Jul-2013

    Published In

    CIKM '10: Proceedings of the 19th ACM international conference on Information and knowledge management
    October 2010
    2036 pages
    ISBN:9781450300995
    DOI:10.1145/1871437

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. classification
    2. concept drift detection
    3. rule induction

    Qualifiers

    • Research-article

    Conference

    CIKM '10

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
