Abstract
Community detection (or clustering) in large-scale graphs is an important problem in graph mining. Communities reveal interesting organizational and functional characteristics of a network. The Louvain algorithm is an efficient sequential algorithm for community detection. However, such sequential algorithms fail to scale to emerging large-scale data, and scalable parallel algorithms are necessary to process large graph datasets. In this work, we present a comparative analysis of our different parallel implementations of the Louvain algorithm. We design parallel algorithms for the Louvain method in shared-memory and distributed-memory settings. Developing distributed-memory parallel algorithms is challenging because of inter-process communication and load-balancing issues. We incorporate dynamic load balancing in our final algorithm, DPLAL (Distributed Parallel Louvain Algorithm with Load-balancing). DPLAL overcomes the performance bottleneck of the previous algorithms and achieves around a 12-fold speedup while scaling to a larger number of processors. We also compare the performance of our algorithm with some other prominent algorithms in the literature and obtain better or comparable performance. We identify the challenges in developing a distributed-memory algorithm and provide an optimized solution, DPLAL, together with a performance analysis of the algorithm on large-scale real-world networks from different domains.
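For readers unfamiliar with the sequential baseline, the following is a minimal sketch of the first (local-moving) phase of the Louvain method, which greedily moves vertices between communities to increase modularity. It is not DPLAL and contains no parallelism or load balancing. The function names `modularity` and `local_move`, the `max_sweeps` parameter, and the restriction to unweighted, undirected graphs without self-loops are assumptions made for illustration only.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of a partition.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every vertex to a community label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    degree = defaultdict(int)
    internal = defaultdict(int)  # edges with both ends in one community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    total_degree = defaultdict(int)  # sum of degrees per community
    for vertex, d in degree.items():
        total_degree[community[vertex]] += d
    return sum(internal[c] / m - (total_degree[c] / (2 * m)) ** 2
               for c in total_degree)


def local_move(edges, max_sweeps=10):
    """Local-moving phase: start from singletons, move greedily."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    m = len(edges)
    community = {v: v for v in adj}       # every vertex starts alone
    tot = {v: len(adj[v]) for v in adj}   # total degree per community
    for _ in range(max_sweeps):
        moved = False
        for v in sorted(adj):
            k_v = len(adj[v])
            old = community[v]
            links = defaultdict(int)      # edges from v to each community
            for w in adj[v]:
                links[community[w]] += 1
            tot[old] -= k_v               # temporarily remove v
            # Modularity gain of joining c is proportional to
            # k_{v,c} - tot_c * k_v / (2m); staying put is the default.
            best, best_gain = old, links[old] - tot[old] * k_v / (2 * m)
            for c, k_vc in links.items():
                gain = k_vc - tot[c] * k_v / (2 * m)
                if gain > best_gain:
                    best, best_gain = c, gain
            tot[best] += k_v
            if best != old:
                community[v] = best
                moved = True
        if not moved:                     # converged: a full sweep, no moves
            break
    return community
```

The full Louvain method follows this phase with an aggregation step that collapses each community into a single vertex and repeats both phases on the coarsened graph. That second phase is omitted here.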
Acknowledgements
This work has been supported by the Louisiana Board of Regents RCS Grant LEQSF (2017-20)-RDA-25. We also thank the anonymous reviewers for their helpful comments and suggestions to improve this paper.
Cite this article
Sattar, N.S., Arifuzzaman, S. Scalable distributed Louvain algorithm for community detection in large graphs. J Supercomput 78, 10275–10309 (2022). https://doi.org/10.1007/s11227-021-04224-2
DOI: https://doi.org/10.1007/s11227-021-04224-2