DOI: 10.5555/1764441.1764526
Article

Approximate trace of grid-based clusters over high dimensional data streams

Published: 22 May 2007

Abstract

Clustering a large, high-dimensional data set has always been a serious challenge in data mining. A good clustering method should scale flexibly with both the number of dimensions and the size of the data set. We previously proposed a grid-based clustering method for on-line data streams, called the hybrid-partition method. However, as the dimensionality of a data stream increases, the time and space complexity of this method grows rapidly. In this paper, a sibling list is proposed to find the clusters of a multi-dimensional data space based on the one-dimensional clusters of each dimension. Although the identified multi-dimensional clusters may be less accurate, this one-dimensional approach scales better with the number of dimensions because it requires much less memory than the multi-dimensional approach. The confined space of main memory can therefore be used more effectively.
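The memory trade-off described in the abstract can be illustrated with a minimal sketch. It is not the paper's sibling list, whose details are not given here. It assumes fixed-width cells over a known value range and a simple support threshold; the names `OneDimGridSketch`, `cell_budget`, `n_cells` and `min_support` are hypothetical. Each dimension keeps its own one-dimensional histogram, and candidate multi-dimensional regions are formed by combining the dense cells of every dimension.

```python
from collections import Counter
from itertools import product


def cell_budget(n_dims, n_cells):
    """Worst-case number of counters: per-dimension grids vs. one full grid."""
    return n_dims * n_cells, n_cells ** n_dims


class OneDimGridSketch:
    """Per-dimension histograms over a stream (illustrative assumption only)."""

    def __init__(self, n_dims, n_cells, lo=0.0, hi=1.0):
        self.n_cells = n_cells
        self.lo = lo
        self.hi = hi
        self.total = 0
        # One counter per dimension: at most n_dims * n_cells entries overall.
        self.counts = [Counter() for _ in range(n_dims)]

    def _cell(self, value):
        # Map a value to a fixed-width cell index, clamped to the valid range.
        width = (self.hi - self.lo) / self.n_cells
        idx = int((value - self.lo) / width)
        return min(max(idx, 0), self.n_cells - 1)

    def update(self, point):
        # Process one stream element; each coordinate updates its own histogram.
        self.total += 1
        for d, value in enumerate(point):
            self.counts[d][self._cell(value)] += 1

    def dense_cells(self, min_support):
        # Cells holding at least min_support of all points seen, per dimension.
        threshold = min_support * self.total
        return [sorted(c for c, n in cnt.items() if n >= threshold)
                for cnt in self.counts]

    def candidate_regions(self, min_support):
        # Combine one-dimensional dense cells into multi-dimensional candidates.
        return list(product(*self.dense_cells(min_support)))


grid = OneDimGridSketch(n_dims=2, n_cells=10)
for _ in range(50):
    grid.update((0.15, 0.15))
    grid.update((0.85, 0.85))

print(grid.candidate_regions(0.4))  # [(1, 1), (1, 8), (8, 1), (8, 8)]
print(cell_budget(10, 10))          # (100, 10000000000)
```

Only the regions (1, 1) and (8, 8) actually contain points in this toy stream; (1, 8) and (8, 1) are false candidates. This shows the kind of accuracy loss the abstract accepts in exchange for keeping a number of counters that grows linearly, rather than exponentially, with the number of dimensions.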

    Published In

    PAKDD'07: Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining
    May 2007
    1160 pages
    ISBN:9783540717003
    • Editors:
    • Zhi-Hua Zhou,
    • Hang Li,
    • Qiang Yang

    Sponsors

    • NSF of China: National Natural Science Foundation of China
    • Microsoft adCenter Labs
    • Microsoft Research Asia
    • Salford Systems
    • NEC: NEC Labs China

    In-Cooperation

    • Singapore Institute of Statistics
    • Nanjing University of Aeronautics and Astronautics
    • The Japanese Society for Artificial Intelligence

    Publisher

    Springer-Verlag

    Berlin, Heidelberg

    Author Tags

    1. clustering
    2. data mining
    3. data stream
    4. grid-based clustering

    Qualifiers

    • Article
