Clustering in Hypergraphs to Minimize Average Edge Service Time

Published: 06 June 2020

Abstract

We study the problem of clustering the vertices of a weighted hypergraph such that, on average, the vertices of each edge can be covered by a small number of clusters. This problem has many applications, such as designing medical tests, clustering files on disk servers, and placing network services on servers. The edges of the hypergraph model groups of items that are likely to be needed together, and the optimization criterion that we use can be interpreted as the average delay (or cost) to serve the items of a typical edge. We describe and analyze algorithms for this problem both for the case in which the clusters must be disjoint and for the case in which clusters may overlap. The analysis is often subtle and reveals interesting structure and invariants that one can utilize.
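
To make the objective concrete, the following is a minimal Python sketch of how the average edge service cost could be evaluated for a given clustering in the disjoint case. The abstract does not state the exact cost function, so the sketch assumes that the cost of an edge is the number of distinct clusters containing at least one of its vertices, and that the average is weighted by edge weight. The function name `average_edge_cover_cost`, the data layout, and the toy hypergraph are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, FrozenSet, Hashable, List, Tuple

Edge = Tuple[FrozenSet[Hashable], float]  # (vertices of the edge, edge weight)


def average_edge_cover_cost(edges: List[Edge],
                            cluster_of: Dict[Hashable, int]) -> float:
    """Weighted average number of disjoint clusters touched by an edge.

    Assumption (not from the paper): an edge's cost is the number of
    distinct clusters that contain at least one of its vertices.
    """
    total_weight = sum(weight for _, weight in edges)
    if total_weight <= 0:
        raise ValueError("edge weights must sum to a positive value")
    weighted_cost = sum(
        weight * len({cluster_of[v] for v in vertices})
        for vertices, weight in edges
    )
    return weighted_cost / total_weight


# Hypothetical toy hypergraph on vertices a, b, c, d.
edges = [
    (frozenset({"a", "b"}), 2.0),
    (frozenset({"a", "c", "d"}), 1.0),
    (frozenset({"c", "d"}), 1.0),
]

# Clustering 1: {a, b} and {c, d}. Edge costs are 1, 2, 1.
aligned = {"a": 0, "b": 0, "c": 1, "d": 1}
# Clustering 2: {a, c} and {b, d}. Every edge touches both clusters.
crossed = {"a": 0, "c": 0, "b": 1, "d": 1}

aligned_cost = average_edge_cover_cost(edges, aligned)  # (2*1 + 1*2 + 1*1) / 4 = 1.25
crossed_cost = average_edge_cover_cost(edges, crossed)  # (2*2 + 1*2 + 1*2) / 4 = 2.0
```

The sketch only evaluates a given disjoint clustering; it does not search for a good one. When clusters may overlap, finding the smallest number of clusters that cover an edge is itself a set-cover computation, which this sketch does not attempt.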

      Published In

      ACM Transactions on Algorithms  Volume 16, Issue 3
      July 2020
      368 pages
      ISSN:1549-6325
      EISSN:1549-6333
      DOI:10.1145/3403658

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 06 June 2020
      Online AM: 07 May 2020
      Accepted: 01 February 2020
      Received: 01 March 2019
      Published in TALG Volume 16, Issue 3

      Author Tags

      1. Clustering
      2. average cover time
      3. hypergraphs
      4. set cover

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • Gordon Fund for System Engineering
      • Israel National Cyber Directorate
      • Technion Hiroshi Fujiwara Cyber Security Research Center
      • Israel Science Foundation (ISF)
      • German-Israeli Science Foundation (GIF)
      • Taub Family Foundation

      Article Metrics

      • Total Citations: 0
      • Total Downloads: 108
      • Downloads (Last 12 months): 7
      • Downloads (Last 6 weeks): 0
      Reflects downloads up to 03 Oct 2024
