A Lovász-Simonovits Theorem for Hypergraphs with Application to Local Clustering

Published: 30 September 2024

Abstract

We present the first analysis of diffusion on hypergraphs based on the Lovász-Simonovits theory. We demonstrate that an averaging-based diffusion operator is the appropriate generalization of the lazy random walk diffusion on 2-graphs, because the diffusion converges rapidly to its stationary state from any initial state. By proving a Lovász-Simonovits-like theorem for this diffusion, we show that the diffusion rate depends on the hypergraph's conductance. To use averaging-based diffusion for clustering, we define a generalization of personalized PageRank to hypergraphs, which we call "Averaging-based Personalized Page Rank for Hypergraphs" (APPRH). Unlike the hypergraph diffusions previously used for clustering in the literature, averaging-based diffusion is linear, which allows us to compute APPRH efficiently with the Forward Push algorithm. Using this method, we obtain theoretical bounds on the conductance of our clustering that improve on the best-known bounds in the literature by at least a constant factor. We compare our algorithm, A-HyperCut, against baselines on million-scale hypergraphs and find that it is an order of magnitude faster while remaining competitive in the conductance of the local clusters it produces.
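
For readers unfamiliar with the building blocks named in the abstract, the following is a minimal sketch of the classical Forward Push procedure for approximate personalized PageRank on an ordinary graph (a 2-graph) with a lazy random walk, followed by a conductance sweep cut, in the style of Andersen, Chung, and Lang. It is not the paper's APPRH or A-HyperCut: the hypergraph averaging operator is not reproduced here. The function names `forward_push` and `sweep_cut`, the adjacency-dictionary input, and the default values of `alpha` and `eps` are illustrative assumptions, and the graph is assumed to be simple and undirected with no isolated vertices.

```python
from collections import deque


def forward_push(adj, seed, alpha=0.15, eps=1e-4):
    """Approximate personalized PageRank on a plain graph (assumed setting).

    adj maps each vertex to a list of neighbours (simple, undirected,
    every vertex has degree >= 1). Returns (p, r): the estimate p and the
    residual r, with r[v] < eps * deg(v) for every vertex on exit.
    """
    p = {}
    r = {seed: 1.0}
    queue = deque([seed])
    queued = {seed}
    while queue:
        u = queue.popleft()
        queued.discard(u)
        deg_u = len(adj[u])
        ru = r.get(u, 0.0)
        if ru < eps * deg_u:
            continue
        # Push step for the lazy walk: keep an alpha fraction at u,
        # leave half of the remainder at u, spread the other half evenly.
        p[u] = p.get(u, 0.0) + alpha * ru
        r[u] = (1 - alpha) * ru / 2
        share = (1 - alpha) * ru / (2 * deg_u)
        for v in adj[u]:
            r[v] = r.get(v, 0.0) + share
            if v not in queued and r[v] >= eps * len(adj[v]):
                queue.append(v)
                queued.add(v)
        if r[u] >= eps * deg_u and u not in queued:
            queue.append(u)
            queued.add(u)
    return p, r


def sweep_cut(adj, p):
    """Return (conductance, vertices) of the best prefix when vertices are
    ordered by degree-normalised score p[v] / deg(v)."""
    total_vol = sum(len(nbrs) for nbrs in adj.values())
    order = sorted(p, key=lambda v: p[v] / len(adj[v]), reverse=True)
    in_set = set()
    vol = 0
    cut = 0
    best_phi, best_set = float("inf"), []
    for i, v in enumerate(order):
        in_set.add(v)
        vol += len(adj[v])
        inside = sum(1 for w in adj[v] if w in in_set)
        # Edges from v into the set stop being cut; the rest become cut.
        cut += len(adj[v]) - 2 * inside
        denom = min(vol, total_vol - vol)
        if denom == 0:
            break
        phi = cut / denom
        if phi < best_phi:
            best_phi, best_set = phi, order[: i + 1]
    return best_phi, best_set
```

Each push moves an `alpha` fraction of a vertex's residual into the estimate and redistributes the rest, so the total mass in `p` and `r` stays at 1 and only vertices near the seed are ever touched. Each push is a linear update of the residual, which suggests why the abstract stresses that a linear diffusion is what makes Forward Push applicable.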

Published In

Proceedings of the ACM on Management of Data, Volume 2, Issue 4
SIGMOD
September 2024
458 pages
EISSN:2836-6573
DOI:10.1145/3698442

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. clustering
  2. collaboration networks
  3. graphs
  4. hypergraphs
  5. Lovász-Simonovits theorem
  6. personalized pagerank
  7. social networks

Qualifiers

  • Research-article
