
Incremental Community Detection on Large Complex Attributed Network

Published: 19 May 2021

Abstract

Community detection on network data is a fundamental task with many industrial applications. Industrial network data can be very large, can carry incomplete and complex attributes, and, more importantly, keeps growing. This calls for a community detection technique that handles both attribute and topological information on large-scale networks and that is also incremental. In this article, we propose inc-AGGMMR, an incremental community detection framework that effectively addresses the challenges of scalability, mixed attributes, incomplete values, and network evolution. By constructing an augmented graph, we map attributes into the network through attribute centers and belongingness edges. Communities are then detected by modularity maximization. During this process, we adjust the weights of the belongingness edges to balance the contributions of attribute and topological information to community detection. This weight adjustment mechanism enables incremental updates of the community membership of all vertices. We evaluate inc-AGGMMR on five benchmark datasets against eight strong baselines. We also provide a case study that incrementally detects communities on a PayPal payment network of users and their transactions. The results demonstrate the effectiveness and practicality of inc-AGGMMR.
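
The augmented-graph idea in the abstract can be illustrated with a small sketch. This is not the inc-AGGMMR implementation: the abstract does not say how attribute centers are chosen or how the weights are adjusted, so the code below assumes one center per categorical attribute value and a single fixed belongingness weight `w`. The function names, the toy graph, and the two candidate partitions are all hypothetical. The sketch only shows how the belongingness weight shifts the modularity score between a topology-driven and an attribute-driven placement of one vertex.

```python
from collections import defaultdict

# Toy network (hypothetical): two triangles joined by the edge c-d.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
# Vertex c sits in the first triangle but shares its attribute with the second.
ATTRS = {"a": "x", "b": "x", "c": "y", "d": "y", "e": "y", "f": "y"}


def build_augmented_graph(edges, attrs, w):
    """Weighted adjacency: topology edges (weight 1.0) plus one belongingness
    edge of weight w from each vertex to its attribute center.
    Assumption: one center per distinct attribute value."""
    adj = defaultdict(dict)
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1.0
    for v, value in attrs.items():
        center = ("center", value)
        adj[v][center] = adj[center][v] = w
    return adj


def modularity(adj, community):
    """Weighted Newman modularity: sum over c of in_c/(2m) - (deg_c/(2m))^2."""
    two_m = sum(wt for nbrs in adj.values() for wt in nbrs.values())
    internal = defaultdict(float)  # ordered-pair weight inside each community
    degree = defaultdict(float)    # total weighted degree of each community
    for u, nbrs in adj.items():
        for v, wt in nbrs.items():
            degree[community[u]] += wt
            if community[u] == community[v]:
                internal[community[u]] += wt
    return sum(internal[c] / two_m - (degree[c] / two_m) ** 2 for c in degree)


def partition(c_side):
    """Two communities; only the placement of vertex c varies."""
    comm = {"a": 0, "b": 0, "c": c_side, "d": 1, "e": 1, "f": 1}
    comm[("center", "x")] = 0
    comm[("center", "y")] = 1
    return comm


def compare(w):
    """Return (Q with c placed by topology, Q with c placed by attribute)."""
    adj = build_augmented_graph(EDGES, ATTRS, w)
    return modularity(adj, partition(0)), modularity(adj, partition(1))


if __name__ == "__main__":
    for w in (0.1, 5.0):
        q_topo, q_attr = compare(w)
        print(f"w={w}: Q(c by topology)={q_topo:.3f}, Q(c by attribute)={q_attr:.3f}")
```

In this toy setting, a small belongingness weight keeps vertex c with its topological neighbours, while a large weight makes the attribute-based placement score higher. That is the kind of trade-off the weight adjustment described in the abstract is meant to balance.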

Published In

ACM Transactions on Knowledge Discovery from Data  Volume 15, Issue 6
June 2021
474 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/3465438
Editor: Charu Aggarwal

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 19 May 2021
Accepted: 01 February 2021
Revised: 01 January 2021
Received: 01 July 2020
Published in TKDD Volume 15, Issue 6

Author Tags

  1. Social network
  2. payment network
  3. attributed network
  4. community detection

Qualifiers

  • Research-article
  • Refereed

Cited By

  • (2024) Misinformation, Disinformation, and Generative AI: Implications for Perception and Policy. Digital Government: Research and Practice. DOI: 10.1145/3689372. Online publication date: 23-Aug-2024
  • (2024) Multi-Order Clustering on Dynamic Networks: On Error Accumulation and Its Elimination. IEEE INFOCOM 2024 - IEEE Conference on Computer Communications, 1950-1959. DOI: 10.1109/INFOCOM52122.2024.10621124. Online publication date: 20-May-2024
  • (2024) Construction of Knowledge Map and Intelligent Recommendation Algorithm of College Specialized Basic Courses Based On Deep Neural Network and Wechat Applet. Procedia Computer Science 243, 766-774. DOI: 10.1016/j.procs.2024.09.092. Online publication date: 2024
  • (2023) Efficient Community Search in Edge-Attributed Graphs. IEEE Transactions on Knowledge and Data Engineering 35:10, 10790-10806. DOI: 10.1109/TKDE.2023.3267550. Online publication date: 1-Oct-2023
  • (2023) Modeling and Detecting Communities in Node Attributed Networks. IEEE Transactions on Knowledge and Data Engineering 35:7, 7206-7219. DOI: 10.1109/TKDE.2022.3197612. Online publication date: 1-Jul-2023
  • (2023) Measures and Optimization for Robustness and Vulnerability in Disconnected Networks. IEEE Transactions on Information Forensics and Security 18, 3350-3362. DOI: 10.1109/TIFS.2023.3279979. Online publication date: 1-Jan-2023
  • (2023) A Graph Convolutional Neural Network for Recommendation Based on Community Detection and Combination of Multiple Heterogeneous Graphs. 2023 IEEE International Conference on Data Mining (ICDM), 1235-1240. DOI: 10.1109/ICDM58522.2023.00154. Online publication date: 1-Dec-2023
  • (2022) State-of-the-Art in Community Detection in Temporal Networks. Artificial Intelligence Applications and Innovations. AIAI 2022 IFIP WG 12.5 International Workshops, 370-381. DOI: 10.1007/978-3-031-08341-9_30. Online publication date: 10-Jun-2022
