Merging DBSCAN and Density Peak for Robust Clustering

Hou, Jian; Lv, Chengcong; Zhang, Aihua; E, Xu

doi:10.1007/978-3-030-30490-4_48

Jian Hou ORCID: orcid.org/0000-0001-6515-1430¹²,
Chengcong Lv¹²,
Aihua Zhang¹² &
Xu E¹³

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11730)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

In data clustering, density-based algorithms are well known for their ability to detect clusters of arbitrary shape. DBSCAN is a widely used density-based clustering approach, and the recently proposed density peak (DP) algorithm has shown significant potential in experiments. However, DBSCAN may misclassify low-density border points as noise and does not work well when density varies widely across clusters, while the DP algorithm depends heavily on the detected cluster centers. To circumvent these problems, we study the two algorithms and find that they have complementary properties. We then propose to combine them to overcome their respective problems. Specifically, we use the DP algorithm to detect cluster centers and then determine the parameters for DBSCAN adaptively. After DBSCAN clustering, we further use the DP algorithm to include low-density border points in clusters. By combining the complementary properties of the two algorithms, we relieve the problems of DBSCAN while avoiding the drawbacks of the DP algorithm. Our algorithm is tested on synthetic and real datasets and is shown to perform better than DBSCAN and the DP algorithm, as well as some other clustering algorithms.
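The last step described in the abstract, using density peak information to pull low-density border points into DBSCAN clusters, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard density peak quantities (local density counted within a cutoff distance, and the nearest neighbor of higher density) and a textbook DBSCAN. The abstract does not say how the DBSCAN parameters are derived from the detected centers, so `eps`, `min_pts`, and `d_c` are set by hand here, and all function names are hypothetical.

```python
import math

def pairwise(points):
    """Full Euclidean distance matrix as nested lists."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

def dp_quantities(dist, d_c):
    """Density peak quantities: cutoff-kernel density and nearest denser neighbor."""
    n = len(dist)
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    # Decreasing density; ties broken by index so "denser" is a strict order.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta, parent = [0.0] * n, [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])  # densest point has no denser neighbor
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i], parent[i] = dist[i][j], j
    return rho, delta, parent, order

def dbscan(dist, eps, min_pts):
    """Textbook DBSCAN on a distance matrix; noise is labeled -1."""
    n = len(dist)
    neighbors = [[j for j in range(n) if dist[i][j] <= eps] for i in range(n)]
    core = [len(nb) >= min_pts for nb in neighbors]  # a point counts itself
    labels, cluster = [-1] * n, 0
    for i in range(n):
        if not core[i] or labels[i] != -1:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    if core[q]:  # only core points keep expanding the cluster
                        stack.append(q)
        cluster += 1
    return labels

def absorb_noise(labels, parent, order):
    """Give each noise point the label of its nearest denser neighbor."""
    labels = list(labels)
    for i in order:  # decreasing density, so the parent's label is already final
        if labels[i] == -1 and parent[i] != -1:
            labels[i] = labels[parent[i]]
    return labels

# Two compact groups plus one sparse point near the first group.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
          (2.5, 0.5)]
dist = pairwise(points)
rho, delta, parent, order = dp_quantities(dist, d_c=1.1)
raw = dbscan(dist, eps=1.1, min_pts=3)
final = absorb_noise(raw, parent, order)
print(raw)
print(final)
```

In this toy data the sparse point at (2.5, 0.5) has no neighbor within `eps`, so plain DBSCAN leaves it as noise; the density peak rule then attaches it to the first group through its nearest denser neighbor. In the density peak algorithm, points with both large `rho` and large `delta` are the candidate cluster centers; how the paper turns those centers into DBSCAN parameters is not covered by this sketch.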

Acknowledgement

This work is supported in part by the National Natural Science Foundation of China under Grant No. 61473045, and by the Natural Science Foundation of Liaoning Province under Grant No. 20170540013.

Author information

Authors and Affiliations

College of Engineering, Bohai University, Jinzhou, 121013, China
Jian Hou, Chengcong Lv & Aihua Zhang
College of Information Sciences, Bohai University, Jinzhou, 121013, China
Xu E

Corresponding author

Correspondence to Jian Hou.

Editor information

Editors and Affiliations

Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), Neuherberg, Germany
Igor V. Tetko
Institute of Computer Science, Czech Academy of Sciences, Prague 8, Czech Republic
Věra Kůrková
Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), Neuherberg, Germany
Pavel Karpov
Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), Neuherberg, Germany
Fabian Theis

Cite this paper

Hou, J., Lv, C., Zhang, A., E, X. (2019). Merging DBSCAN and Density Peak for Robust Clustering. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Text and Time Series. ICANN 2019. Lecture Notes in Computer Science, vol 11730. Springer, Cham. https://doi.org/10.1007/978-3-030-30490-4_48

DOI: https://doi.org/10.1007/978-3-030-30490-4_48
Published: 09 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30489-8
Online ISBN: 978-3-030-30490-4
eBook Packages: Computer Science, Computer Science (R0)
