Abstract
Structural graph clustering is one of the fundamental problems in managing and analyzing graph data. The structural clustering algorithm SCAN is successfully used in many applications because it obtains not only clusters but also hubs and outliers. However, the results of SCAN heavily depend on two sensitive parameters, \(\epsilon\) and \(\mu\), which may result in loss of accuracy and efficiency. In this paper, we propose a novel Density Peak-based Structural Clustering Algorithm for Networks (DPSCAN). Specifically, DPSCAN clusters vertices based on the structural similarity and the structural dependency between vertices and their neighbors, without tuning parameters. Through theoretical analysis, we prove that DPSCAN can detect meaningful clusters, hubs and outliers. In addition, to improve the efficiency of DPSCAN, we propose a new index structure named DP-Index, so that each vertex needs to be visited only once. Finally, we conduct comprehensive experimental studies on real and synthetic graphs, which demonstrate that our new approach outperforms the state-of-the-art approaches.
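To make the parameter sensitivity mentioned above concrete, the following is a minimal sketch of the structural similarity and the core-vertex test as they are commonly defined for SCAN, not of DPSCAN itself (the abstract does not define structural dependency or the DP-Index, so neither is sketched here). It assumes the usual SCAN definition \(\sigma(u,v) = |\Gamma(u) \cap \Gamma(v)| / \sqrt{|\Gamma(u)|\,|\Gamma(v)|}\), where \(\Gamma(v)\) is the neighborhood of \(v\) including \(v\) itself. The function names and the toy graph are hypothetical and chosen only for illustration.

```python
from math import sqrt


def closed_neighborhood(adj, v):
    """Return the neighbors of v together with v itself."""
    return set(adj[v]) | {v}


def structural_similarity(adj, u, v):
    """Shared closed neighbors, normalized by the geometric mean of the sizes."""
    gu = closed_neighborhood(adj, u)
    gv = closed_neighborhood(adj, v)
    return len(gu & gv) / sqrt(len(gu) * len(gv))


def epsilon_neighborhood(adj, v, eps):
    """Vertices in the closed neighborhood of v with similarity at least eps."""
    return {
        w
        for w in closed_neighborhood(adj, v)
        if structural_similarity(adj, v, w) >= eps
    }


def is_core(adj, v, eps, mu):
    """SCAN-style core test: the eps-neighborhood has at least mu vertices."""
    return len(epsilon_neighborhood(adj, v, eps)) >= mu


# Hypothetical toy graph: triangles (0, 1, 2) and (3, 4, 5) joined by edge 2-3.
toy_graph = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4, 5],
    4: [3, 5],
    5: [3, 4],
}

if __name__ == "__main__":
    # The same vertex changes role when only eps changes.
    for eps in (0.7, 0.9):
        print(eps, is_core(toy_graph, 0, eps, mu=3))
```

On this toy graph, vertex 0 passes the core test at \(\epsilon = 0.7\) but fails it at \(\epsilon = 0.9\) with \(\mu = 3\) in both cases, which illustrates the kind of sensitivity that a parameter-free method aims to avoid.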
Acknowledgements
This work is supported by the National Key R&D Program of China (2018YFB1003404), the National Natural Science Foundation of China (61872070, U1435216, U1811261, and 61602103), and the Fundamental Research Funds for the Central Universities (N171605001).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Wu, C., Gu, Y., Yu, G. (2019). DPSCAN: Structural Graph Clustering Based on Density Peaks. In: Li, G., Yang, J., Gama, J., Natwichai, J., Tong, Y. (eds) Database Systems for Advanced Applications. DASFAA 2019. Lecture Notes in Computer Science, vol 11447. Springer, Cham. https://doi.org/10.1007/978-3-030-18579-4_37
DOI: https://doi.org/10.1007/978-3-030-18579-4_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-18578-7
Online ISBN: 978-3-030-18579-4
eBook Packages: Computer Science (R0)