Abstract
We present a novel algorithm, DBSC, which finds subspace clusters in numerical datasets based on the concept of “dependency”. The algorithm uses a depth-first search strategy to find the maximal subspaces: a new dimension is added to the current k-subspace, and the result is evaluated for validity as a (k+1)-subspace. The clusters within those maximal subspaces are then mined in a fashion similar to the maximal subspace mining itself. Experiments on synthetic and real datasets show that the algorithm is both effective and efficient on high-dimensional datasets.
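The depth-first extension step described in the abstract can be illustrated with a minimal sketch. This is not the DBSC implementation: the abstract does not define the dependency criterion, so the sketch takes validity as a caller-supplied predicate. The names `maximal_subspaces` and `is_valid` are hypothetical, and the sketch assumes that validity is downward-closed (every subset of a valid subspace is also valid), which the abstract does not state.

```python
from typing import Callable, FrozenSet, List, Set

Subspace = FrozenSet[int]


def maximal_subspaces(
    n_dims: int, is_valid: Callable[[Subspace], bool]
) -> List[Subspace]:
    """Enumerate maximal valid subspaces by depth-first extension.

    `is_valid` is a hypothetical stand-in for the paper's dependency
    test; it is assumed to be downward-closed.
    """
    found: Set[Subspace] = set()

    def extend(current: Subspace, next_dim: int) -> None:
        grew = False
        # Try to grow the current k-subspace into a (k+1)-subspace.
        for d in range(next_dim, n_dims):
            candidate = current | {d}
            if is_valid(candidate):
                grew = True
                extend(candidate, d + 1)
        # Record non-empty subspaces that could not be extended further.
        if current and not grew:
            found.add(current)

    extend(frozenset(), 0)
    # Dimensions are only added in increasing order, so drop any recorded
    # subspace that is strictly contained in another recorded one.
    maximal = [s for s in found if not any(s < t for t in found)]
    return sorted(maximal, key=sorted)


# Toy validity test (illustrative only): a subspace is valid if it lies
# entirely inside {0, 1, 2} or entirely inside {2, 3}.
def toy_is_valid(subspace: Subspace) -> bool:
    return subspace <= {0, 1, 2} or subspace <= {2, 3}


result = maximal_subspaces(4, toy_is_valid)
```

Passing the validity test in as a predicate keeps the search strategy separate from the dependency measure, so the same traversal could be reused with whatever criterion the paper actually defines.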
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, X., Li, C. (2007). DBSC: A Dependency-Based Subspace Clustering Algorithm for High Dimensional Numerical Datasets. In: Orgun, M.A., Thornton, J. (eds) AI 2007: Advances in Artificial Intelligence. AI 2007. Lecture Notes in Computer Science, vol 4830. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76928-6_101
DOI: https://doi.org/10.1007/978-3-540-76928-6_101
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76926-2
Online ISBN: 978-3-540-76928-6
eBook Packages: Computer Science, Computer Science (R0)