DOI: 10.1145/3394486.3403238

Parameterized Correlation Clustering in Hypergraphs and Bipartite Graphs

Published: 20 August 2020

    Abstract

    Motivated by applications in community detection and dense subgraph discovery, we consider new clustering objectives in hypergraphs and bipartite graphs. These objectives are parameterized by one or more resolution parameters in order to enable diverse knowledge discovery in complex data.
    For both hypergraph and bipartite objectives, we identify relevant parameter regimes that are equivalent to existing objectives and share their (polynomial-time) approximation algorithms. We first show that our parameterized hypergraph correlation clustering objective is related to higher-order notions of normalized cut and modularity in hypergraphs. It is further amenable to approximation algorithms via hyperedge expansion techniques.
    Our parameterized bipartite correlation clustering objective generalizes standard unweighted bipartite correlation clustering, as well as the bicluster deletion problem. For a certain choice of parameters it is also related to our hypergraph objective. Although the bipartite objective is NP-hard in general, we highlight a parameter regime where the problem reduces to the bipartite matching problem and thus can be solved in polynomial time. For other parameter settings, we present several approximation algorithms using linear program rounding techniques. These results allow us to introduce the first constant-factor approximation for bicluster deletion, the task of removing a minimum number of edges to partition a bipartite graph into disjoint bi-cliques.
    In several experiments, we highlight the flexibility of our framework and the diversity of results that can be obtained in different parameter settings. These include clustering bipartite graphs across a range of parameters, detecting motif-rich clusters in an email network and a food web, and forming clusters of retail products in a product review hypergraph that are highly correlated with known product categories.
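
To make the bicluster deletion task described above concrete, here is a minimal Python sketch that scores one candidate clustering of a small bipartite graph. It is only an illustration of the problem statement, not the paper's approximation algorithm. The function name `bicluster_deletion_cost`, the `cluster_of` mapping, and the toy graph are all assumptions introduced here. The function counts the edges that must be deleted (those joining different clusters) and rejects any clustering in which some cluster is not a bi-clique.

```python
from itertools import product


def bicluster_deletion_cost(left, right, edges, cluster_of):
    """Count edges deleted by a clustering, assuming it yields disjoint bi-cliques.

    left, right: node lists for the two sides of the bipartite graph.
    edges: iterable of (left_node, right_node) pairs.
    cluster_of: dict mapping every node to a cluster label (hypothetical format).
    Raises ValueError if some cluster is not a bi-clique after the deletions.
    """
    edge_set = set(edges)

    # Every edge whose endpoints land in different clusters must be removed.
    deleted = sum(1 for u, v in edge_set if cluster_of[u] != cluster_of[v])

    # Deletion only: a left-right pair in the same cluster must already be an edge.
    for u, v in product(left, right):
        if cluster_of[u] == cluster_of[v] and (u, v) not in edge_set:
            raise ValueError(f"cluster {cluster_of[u]} is missing edge ({u}, {v})")

    return deleted


# Toy graph: a-x, a-y, b-y. Grouping {a, x, y} and {b} removes only edge b-y.
left, right = ["a", "b"], ["x", "y"]
edges = [("a", "x"), ("a", "y"), ("b", "y")]
print(bicluster_deletion_cost(left, right, edges, {"a": 0, "x": 0, "y": 0, "b": 1}))  # prints 1
```

Placing all four nodes in one cluster is rejected, because the pair (b, x) is not an edge and bicluster deletion does not allow adding edges. Finding the clustering that minimizes this cost is the optimization problem the abstract refers to.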



    Published In

    KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
    August 2020
    3664 pages
    ISBN:9781450379984
    DOI:10.1145/3394486

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. bipartite graphs
    2. correlation clustering
    3. hypergraphs

    Qualifiers

    • Research-article

    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
