research-article

Modularity-based Hypergraph Clustering: Random Hypergraph Model, Hyperedge-cluster Relation, and Computation

Authors:

Hong Cheng

Proceedings of the ACM on Management of Data, Volume 1, Issue 3

Article No.: 215, Pages 1 - 25

https://doi.org/10.1145/3617335

Published: 13 November 2023

Abstract

A graph models the connections among objects. One important graph analytical task is clustering, which partitions a data graph into clusters with dense inner-cluster connections. One line of clustering methods maximizes a function called modularity. Modularity-based clustering is widely adopted on dyadic graphs because of its scalability and clustering quality, which depend heavily on the choice of random graph model. The random graph model determines not only which clustering is preferred (modularity measures the quality of a clustering by its alignment with the edges of a random graph) but also the cost of computing that alignment. Existing random hypergraph models either measure the hyperedge-cluster alignment in an All-Or-Nothing (AON) manner, losing important group-wise information, or introduce expensive alignment computation, preventing the clustering from scaling up. This paper proposes a new random hypergraph model called the Hyperedge Expansion Model (HEM), a non-AON hypergraph modularity function called Partial Innerclusteredge modularity (PI) based on HEM, a clustering algorithm called Partial Innerclusteredge Clustering (PIC) that optimizes PI, and novel computation optimizations. PIC is a scalable modularity-based hypergraph clustering algorithm that effectively captures the non-AON hyperedge-cluster relation. Our experiments show that PIC outperforms eight state-of-the-art methods on real-world hypergraphs in both clustering quality and scalability, and is up to five orders of magnitude faster than the baseline methods.
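
For readers unfamiliar with modularity on dyadic graphs, the following is a minimal sketch of the classical Newman-Girvan modularity, which compares the observed fraction of inner-cluster edges with the fraction expected under a random graph that preserves node degrees. It is not the PI function or the HEM model proposed in the paper, whose definitions are not given in this abstract; the function name `modularity` and its input format are assumptions made for illustration.

```python
from collections import defaultdict


def modularity(edges, cluster_of):
    """Classical modularity of an undirected, unweighted dyadic graph.

    edges:      list of (u, v) node pairs (illustrative input format)
    cluster_of: dict mapping each node to its cluster label
    """
    m = len(edges)
    degree_sum = defaultdict(int)  # total node degree inside each cluster
    inner = defaultdict(int)       # edges with both endpoints in the cluster
    for u, v in edges:
        degree_sum[cluster_of[u]] += 1
        degree_sum[cluster_of[v]] += 1
        if cluster_of[u] == cluster_of[v]:
            inner[cluster_of[u]] += 1
    # Observed inner-cluster edge fraction minus the fraction expected
    # under a degree-preserving random graph, summed over clusters.
    return sum(
        inner[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {node: "a" for node in range(6)}
print(modularity(edges, two_clusters))  # 5/14, about 0.357
print(modularity(edges, one_cluster))   # 0.0
```

Splitting the two triangles scores higher than putting all nodes in one cluster, which is the preference the random graph model encodes. In this dyadic setting an edge is either inside a cluster or not; the AON issue described in the abstract arises only when a hyperedge with more than two nodes is partly inside a cluster.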


Cited By

Stumpf M. (2024). Hypergraph animals. Physical Review E, 110:4. Online publication date: 17-Oct-2024. https://doi.org/10.1103/PhysRevE.110.044125
Wang Q, Feng Y, Wang Y, Li B, Wen J, Zhou X, Song Q. (2024). AntiFormer: graph enhanced large language model for binding affinity prediction. Briefings in Bioinformatics, 25:5. Online publication date: 20-Aug-2024. https://doi.org/10.1093/bib/bbae403

Published In

Proceedings of the ACM on Management of Data Volume 1, Issue 3

PACMMOD

September 2023

472 pages

EISSN:2836-6573

DOI:10.1145/3632968

Editor:
Divyakant Agrawal
UC Santa Barbara, United States

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Marsden Fund
National Natural Science Foundation of China
The Shun Hing Institute of Advanced Engineering, The Chinese University of Hong Kong
Ministry of Business, Innovation and Employment, New Zealand
Research Grants Council, University Grants Committee, Hong Kong
