Abstract
Graph clustering is a fundamental technique in data analysis with a vast number of applications in computer science and statistics. In theoretical computer science, the problem of graph clustering has received significant research attention over the past two decades, which has led to pivotal algorithmic breakthroughs. However, the design of most graph clustering algorithms is based on complicated techniques from computational optimisation, which are not applicable for processing massive data sets stored in physically remote locations.
In this work we present a novel distributed algorithm for graph clustering. Most previous algorithms only work for graphs whose clusters are of balanced size, which restricts their applicability in many practical settings. Our proposed algorithm works for graphs with clusters of arbitrary size, and its performance is analysed with respect to every individual cluster. In addition, our algorithm is easy to implement, and only requires a poly-logarithmic number of rounds for many graphs occurring in practice.
Notes
1. Every node v can randomly select a number from the range \(\left[ 1, \mathrm {poly}(n)\right] \) such that, with high probability, these numbers are distinct.
2. The minimum does not play any special role here; it is only used to guarantee consensus among all nodes in the same cluster. The maximum ID would work just as well.
3. For a more detailed discussion of the connection between the sets \(\{f_i\}\), \(\{\chi _{S_i}\}\), \(\{ \widetilde{\chi }_i\}\) we refer the reader to [19].
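Notes 1 and 2 can be illustrated with a small sketch. It is not the paper's algorithm: it assumes that poly(n) is taken to be n^c with c = 3, that cluster membership is already known, and the function names `draw_ids`, `collision_bound`, and `agree_on_label` are hypothetical. The bound used is the standard union bound over all pairs of nodes, which gives a collision probability of at most C(n, 2) / n^c.

```python
import random


def draw_ids(n, c=3, rng=None):
    """Each of n nodes independently draws an ID uniformly from [1, n**c].

    The exponent c is an assumption standing in for poly(n).
    """
    rng = rng or random.Random()
    return [rng.randint(1, n ** c) for _ in range(n)]


def collision_bound(n, c=3):
    """Union bound over all pairs: P[two IDs coincide] <= C(n, 2) / n**c."""
    return (n * (n - 1) / 2) / (n ** c)


def agree_on_label(ids, members, pick=min):
    """All nodes in `members` adopt one shared label: the extreme ID among them.

    `pick` may be min or max; either choice yields consensus.
    """
    label = pick(ids[v] for v in members)
    return {v: label for v in members}


if __name__ == "__main__":
    ids = draw_ids(1000, rng=random.Random(0))
    print("upper bound on collision probability:", collision_bound(1000))
    print("labels with min:", agree_on_label(ids, [0, 1, 2]))
    print("labels with max:", agree_on_label(ids, [0, 1, 2], pick=max))
```

With c = 3 the bound is below 1/(2n), so the IDs are distinct with high probability. Swapping `min` for `max` changes which ID becomes the label but not the fact that every node in the group ends up with the same one.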
References
Allen-Zhu, Z., Lattanzi, S., Mirrokni, V.S.: A local algorithm for finding well-connected clusters. In: 30th International Conference on Machine Learning (ICML 2013), pp. 396–404 (2013)
Becchetti, L., Clementi, A.E.F., Manurangsi, P., Natale, E., Pasquale, F., Raghavendra, P., Trevisan, L.: Average whenever you meet: Opportunistic protocols for community detection. In: 26th European Symposium on Algorithms (ESA’18). LIPIcs, vol. 112, pp. 7:1–7:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2018). https://doi.org/10.4230/LIPIcs.ESA.2018.7
Becchetti, L., Clementi, A.E.F., Natale, E., Pasquale, F., Trevisan, L.: Find your place: simple distributed algorithms for community detection. SIAM J. Comput. 49(4), 821–864 (2020). https://doi.org/10.1137/19M1243026
Becchetti, L., Cruciani, E., Pasquale, F., Rizzo, S.: Step-by-step community detection in volume-regular graphs. Theoret. Comput. Sci. 847, 49–67 (2020). https://doi.org/10.1016/j.tcs.2020.09.036
Boyd, S.P., Ghosh, A., Prabhakar, B., Shah, D.: Randomized gossip algorithms. IEEE Trans. Inf. Theory 52(6), 2508–2530 (2006). https://doi.org/10.1109/TIT.2006.874516
Chang, Y., Saranurak, T.: Improved distributed expander decomposition and nearly optimal triangle enumeration. In: Robinson, P., Ellen, F. (eds.) Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing, PODC 2019, Toronto, ON, Canada, 29 July–2 August 2019, pp. 66–73. ACM (2019). https://doi.org/10.1145/3293611.3331618
Chang, Y., Saranurak, T.: Deterministic distributed expander decomposition and routing with applications in distributed derandomization. In: 61st Annual IEEE Symposium on Foundations of Computer Science (FOCS 2020), pp. 377–388. IEEE (2020). https://doi.org/10.1109/FOCS46700.2020.00043
Chen, J., Sun, H., Woodruff, D.P., Zhang, Q.: Communication-optimal distributed clustering. In: Lee, D.D., Sugiyama, M., von Luxburg, U., Guyon, I., Garnett, R. (eds.) 29th Advances in Neural Information Processing Systems (NeurIPS 2016), pp. 3720–3728 (2016)
Czumaj, A., Peng, P., Sohler, C.: Testing cluster structure of graphs. In: 47th Annual ACM Symposium on Theory of Computing (STOC 2015), pp. 723–732. ACM (2015). https://doi.org/10.1145/2746539.2746618
Fortunato, S.: Community detection in graphs. Phys. Rep. 486(3–5), 75–174 (2010)
Georgakopoulos, A., Haslegrave, J., Sauerwald, T., Sylvester, J.: The power of two choices for random walks. arXiv preprint arXiv:1911.05170 (2019)
Hui, P., Yoneki, E., Chan, S.Y., Crowcroft, J.: Distributed community detection in delay tolerant networks. In: Proceedings of 2nd ACM/IEEE International Workshop on Mobility in the Evolving Internet Architecture. Association for Computing Machinery (2007). https://doi.org/10.1145/1366919.1366929
Laenen, S., Sun, H.: Higher-order spectral clustering of directed graphs. In: 33rd Advances in Neural Information Processing Systems (NeurIPS 2020) (2020)
Lee, J.R., Gharan, S.O., Trevisan, L.: Multiway spectral partitioning and higher-order Cheeger inequalities. J. ACM 61(6), 37:1–37:30 (2014). https://doi.org/10.1145/2665063
Li, A., Peng, P.: Community structures in classical network models. Internet Math. 7(2), 81–106 (2011). https://doi.org/10.1080/15427951.2011.566458
von Luxburg, U.: A tutorial on spectral clustering. Stat. Comput. 17(4), 395–416 (2007). https://doi.org/10.1007/s11222-007-9033-z
Ng, A.Y., Jordan, M.I., Weiss, Y.: On spectral clustering: analysis and an algorithm. In: 14th Advances in Neural Information Processing Systems (NeurIPS 2001), pp. 849–856. MIT Press (2001)
Oveis Gharan, S., Trevisan, L.: Partitioning into expanders. In: 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2014), pp. 1256–1266. SIAM (2014). https://doi.org/10.1137/1.9781611973402.93
Peng, R., Sun, H., Zanetti, L.: Partitioning well-clustered graphs: spectral clustering works!. SIAM J. Comput. 46(2), 710–743 (2017). https://doi.org/10.1137/15M1047209
Sauerwald, T., Zanetti, L.: Random walks on dynamic graphs: mixing times, hitting times, and return probabilities. In: 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). LIPIcs, vol. 132, pp. 93:1–93:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019). https://doi.org/10.4230/LIPIcs.ICALP.2019.93
Schaeffer, S.E.: Graph clustering. Comput. Sci. Rev. 1(1), 27–64 (2007). https://doi.org/10.1016/j.cosrev.2007.05.001
Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000). https://doi.org/10.1109/34.868688
Sun, H., Zanetti, L.: Distributed graph clustering and sparsification. ACM Trans. Parallel Comput. 6(3), 17:1–17:23 (2019). https://doi.org/10.1145/3364208
Yang, W., Xu, H.: A divide and conquer framework for distributed graph clustering. In: 32nd International Conference on Machine Learning (ICML 2015). JMLR Workshop and Conference Proceedings, vol. 37, pp. 504–513 (2015). JMLR.org
Acknowledgements
I would like to thank my supervisor Dr. He Sun for helpful discussion and comments on improving the presentation of this paper.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Manghiuc, BA. (2021). Distributed Detection of Clusters of Arbitrary Size. In: Jurdziński, T., Schmid, S. (eds) Structural Information and Communication Complexity. SIROCCO 2021. Lecture Notes in Computer Science, vol 12810. Springer, Cham. https://doi.org/10.1007/978-3-030-79527-6_21
DOI: https://doi.org/10.1007/978-3-030-79527-6_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-79526-9
Online ISBN: 978-3-030-79527-6
eBook Packages: Computer Science, Computer Science (R0)