Learning Resolution Parameters for Graph Clustering

DOI: 10.1145/3308558.3313471
Published: 13 May 2019

    Abstract

    Finding clusters of well-connected nodes in a graph is an extensively studied problem in graph-based data analysis. Because of its many applications, a large number of distinct graph clustering objective functions and algorithms have already been proposed and analyzed. To aid practitioners in determining the best clustering approach to use in different applications, we present new techniques for automatically learning how to set clustering resolution parameters. These parameters control the size and structure of communities that are formed by optimizing a generalized objective function. We begin by formalizing the notion of a parameter fitness function, which measures how well a fixed input clustering approximately solves a generalized clustering objective for a specific resolution parameter value. Under reasonable assumptions, which suit two key graph clustering applications, such a parameter fitness function can be efficiently minimized using a bisection-like method, yielding a resolution parameter that fits well with the example clustering. We view our framework as a type of single-shot hyperparameter tuning, as we are able to learn a good resolution parameter with just a single example. Our general approach can be applied to learn resolution parameters for both local and global graph clustering objectives. We demonstrate its utility in several experiments on real-world data where it is helpful to learn resolution parameters from a given example clustering.
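
    One way to picture the bisection-like minimization mentioned in the abstract is the sketch below. It uses ternary search, a standard interval-shrinking method swapped in here for illustration, and not necessarily the procedure used in the paper. The names `minimize_unimodal` and `toy_fitness`, the search interval [0, 1], and the assumption that the fitness function is unimodal on that interval are all hypothetical.

```python
from typing import Callable


def minimize_unimodal(
    fitness: Callable[[float], float],
    lo: float,
    hi: float,
    tol: float = 1e-6,
) -> float:
    """Approximate the minimizer of a unimodal function on [lo, hi].

    Ternary search: evaluate two interior points and discard the third of
    the interval that cannot contain the minimizer. This assumes `fitness`
    decreases up to a single minimum and increases afterwards.
    """
    while hi - lo > tol:
        third = (hi - lo) / 3.0
        m1 = lo + third
        m2 = hi - third
        if fitness(m1) <= fitness(m2):
            # The minimizer cannot lie to the right of m2.
            hi = m2
        else:
            # The minimizer cannot lie to the left of m1.
            lo = m1
    return (lo + hi) / 2.0


def toy_fitness(gamma: float) -> float:
    """Hypothetical stand-in for a parameter fitness function.

    A real fitness function would score how well a fixed example clustering
    fits the objective at resolution `gamma`; this toy is a parabola.
    """
    return (gamma - 0.3) ** 2


best_gamma = minimize_unimodal(toy_fitness, 0.0, 1.0)
print(f"learned resolution parameter: {best_gamma:.4f}")
```

    Under these assumptions the search needs only function evaluations, so the toy function could be replaced by any fitness function computed from a single example clustering.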

    Published In

    WWW '19: The World Wide Web Conference
    May 2019
    3620 pages
    ISBN: 9781450366748
    DOI: 10.1145/3308558

    In-Cooperation

    • IW3C2: International World Wide Web Conference Committee

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Graph clustering
    2. community detection
    3. resolution parameters

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    WWW '19: The Web Conference
    May 13 - 17, 2019
    San Francisco, CA, USA

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Article Metrics

    • Downloads (Last 12 months): 16
    • Downloads (Last 6 weeks): 2
    Reflects downloads up to 11 Aug 2024.

    Cited By

    • (2023) Faster Approximation Algorithms for Parameterized Graph Clustering and Edge Labeling. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 78-87. DOI: 10.1145/3583780.3614878. Online publication date: 21-Oct-2023
    • (2023) Flow-Based Algorithms for Improving Clusters: A Unifying Framework, Software, and Performance. SIAM Review 65:1, 59-143. DOI: 10.1137/20M1333055. Online publication date: 9-Feb-2023
    • (2022) K-cluster combinatorial optimization problems is NP_Hardness problem in graph clustering. Proceeding of the 1st International Conference on Advanced Research in Pure and Applied Science (ICARPAS2021): Third Annual Conference of Al-Muthanna University/College of Science, 060034. DOI: 10.1063/5.0093394. Online publication date: 2022
    • (2022) Community detection over feature-rich information networks. Information Systems 109:C. DOI: 10.1016/j.is.2022.102092. Online publication date: 1-Nov-2022
    • (2021) Digraph Clustering by the BlueRed Method. 2021 IEEE High Performance Extreme Computing Conference (HPEC), 1-7. DOI: 10.1109/HPEC49654.2021.9622834. Online publication date: 20-Sep-2021
    • (2020) Strongly local p-norm-cut algorithms for semi-supervised learning and local graph clustering. Proceedings of the 34th International Conference on Neural Information Processing Systems, 5023-5035. DOI: 10.5555/3495724.3496146. Online publication date: 6-Dec-2020
