DOI: 10.1145/3442381.3449887
research-article
Open access

Strongly Local Hypergraph Diffusions for Clustering and Semi-supervised Learning

Published: 03 June 2021

    Abstract

    Hypergraph-based machine learning methods are now widely recognized as important for modeling and using higher-order and multiway relationships between data objects. Local hypergraph clustering and semi-supervised learning specifically involve finding a well-connected set of nodes near a given set of labeled vertices. Although many methods for local graph clustering exist, there are relatively few for localized clustering in hypergraphs. Moreover, those that exist often lack the flexibility to model a general class of hypergraph cut functions or cannot scale to large problems. To tackle these issues, this paper proposes a new diffusion-based hypergraph clustering algorithm that solves a quadratic hypergraph-cut-based objective akin to a hypergraph analog of Andersen-Chung-Lang personalized PageRank clustering for graphs. We prove that, for hypergraphs with fixed maximum hyperedge size, this method is strongly local: its runtime depends only on the size of the output rather than on the size of the hypergraph, which makes it highly scalable. Moreover, our method enables us to compute with a wide variety of cardinality-based hypergraph cut functions. We also prove that the clusters found by solving the new objective function satisfy a Cheeger-like quality guarantee. We demonstrate that on large real-world hypergraphs our new method finds better clusters and runs much faster than existing approaches. Specifically, it runs in a few seconds for hypergraphs with a few million hyperedges, compared with minutes for a flow-based technique. We furthermore show that our framework is general enough that it can also be used to solve other p-norm based cut objectives on hypergraphs.
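
To make the idea of a cardinality-based hypergraph cut function concrete, here is a minimal Python sketch. It is not the paper's diffusion algorithm; it only evaluates the kind of cut and conductance quantities the abstract refers to. It assumes a hypergraph stored as a list of node tuples with no repeated nodes, unit hyperedge weights, and node degree defined as the number of hyperedges containing the node. All function names (`all_or_nothing`, `linear_penalty`, `cardinality_cut`, `conductance`) and the toy data are hypothetical and chosen for illustration.

```python
from collections import Counter


def all_or_nothing(k, size):
    """Penalty 1 if the hyperedge is split at all (0 < k < size), else 0."""
    return 1.0 if 0 < k < size else 0.0


def linear_penalty(k, size):
    """Penalty equal to the number of nodes on the smaller side of the split."""
    return float(min(k, size - k))


def cardinality_cut(hyperedges, S, splitting=all_or_nothing):
    """Sum of splitting penalties; each penalty depends only on how many
    nodes of the hyperedge lie inside S (k) and on the hyperedge size."""
    S = set(S)
    return sum(splitting(sum(v in S for v in e), len(e)) for e in hyperedges)


def conductance(hyperedges, S, splitting=all_or_nothing):
    """Cut value divided by the smaller of vol(S) and vol(complement of S).
    Assumption: vol is the sum of node degrees (hyperedge memberships)."""
    S = set(S)
    degree = Counter(v for e in hyperedges for v in e)
    vol_S = sum(d for v, d in degree.items() if v in S)
    vol_rest = sum(degree.values()) - vol_S
    smaller = min(vol_S, vol_rest)
    if smaller == 0:
        return float("inf")  # one side is empty, so the ratio is undefined
    return cardinality_cut(hyperedges, S, splitting) / smaller


# Toy hypergraph (illustrative data, not taken from the paper).
H = [(0, 1, 2), (0, 1, 2, 3), (2, 3, 4, 5), (4, 5, 6)]
S = {0, 1, 2, 3}

print(cardinality_cut(H, S), cardinality_cut(H, S, linear_penalty))
print(conductance(H, S), conductance(H, S, linear_penalty))
```

On this toy input only the hyperedge `(2, 3, 4, 5)` is split, two nodes on each side, so the two penalties differ: the all-or-nothing cut counts it once, while the linear penalty counts the two nodes on the smaller side.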

    Published In

    WWW '21: Proceedings of the Web Conference 2021
    April 2021
    4054 pages
    ISBN:9781450383127
    DOI:10.1145/3442381
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. PageRank
    2. community detection
    3. hypergraph
    4. local clustering

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    WWW '21
    WWW '21: The Web Conference 2021
    April 19 - 23, 2021
    Ljubljana, Slovenia

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Article Metrics

    • Downloads (last 12 months): 380
    • Downloads (last 6 weeks): 31

    Cited By

    • (2024) Learning the effective order of a hypergraph dynamical system. Science Advances 10:19. DOI: 10.1126/sciadv.adh4053. Online publication date: 10-May-2024
    • (2024) Penalized Flow Hypergraph Local Clustering. IEEE Transactions on Knowledge and Data Engineering 36:5 (2110-2125). DOI: 10.1109/TKDE.2023.3319019. Online publication date: May-2024
    • (2023) Modularity-based Hypergraph Clustering: Random Hypergraph Model, Hyperedge-cluster Relation, and Computation. Proceedings of the ACM on Management of Data 1:3 (1-25). DOI: 10.1145/3617335. Online publication date: 13-Nov-2023
    • (2023) Hypergraph Neural Networks for Time-series Forecasting. 2023 IEEE International Conference on Big Data (BigData) (1076-1080). DOI: 10.1109/BigData59044.2023.10386109. Online publication date: 15-Dec-2023
    • (2023) Semi-supervised and un-supervised clustering. Information Systems 114:C. DOI: 10.1016/j.is.2023.102178. Online publication date: 1-Mar-2023
    • (2023) A flexible PageRank-based graph embedding framework closely related to spectral eigenvector embeddings. Journal of Applied and Computational Topology. DOI: 10.1007/s41468-023-00129-6. Online publication date: 21-Jul-2023
    • (2023) Large Scale Hypergraph Computation. Hypergraph Computation (145-157). DOI: 10.1007/978-981-99-0185-2_8. Online publication date: 17-Jan-2023
    • (2022) Hypergraph Cuts with General Splitting Functions. SIAM Review 64:3 (650-685). DOI: 10.1137/20M1321048. Online publication date: 1-Jan-2022
    • (2022) Hypergraph cuts with edge-dependent vertex weights. Applied Network Science 7:1. DOI: 10.1007/s41109-022-00483-x. Online publication date: 5-Jul-2022
    • (2021) Hyperedge Prediction Using Tensor Eigenvalue Decomposition. Journal of the Indian Institute of Science 101:3 (443-453). DOI: 10.1007/s41745-021-00225-5. Online publication date: 21-Jul-2021
