Hierarchical Clustering: Objective Functions and Algorithms

Published: 05 June 2019

Abstract

Hierarchical clustering is a recursive partitioning of a dataset into clusters at increasingly fine granularity. Motivated by the fact that most work on hierarchical clustering was based on providing algorithms, rather than optimizing a specific objective, Dasgupta framed similarity-based hierarchical clustering as a combinatorial optimization problem, where a “good” hierarchical clustering is one that minimizes a particular cost function [23]. He showed that this cost function has certain desirable properties: to achieve optimal cost, disconnected components (namely, dissimilar elements) must be separated at higher levels of the hierarchy, and when the similarity between data elements is identical, all clusterings achieve the same cost.
We take an axiomatic approach to defining “good” objective functions for both similarity- and dissimilarity-based hierarchical clustering. We characterize a set of admissible objective functions having the property that when the input admits a “natural” ground-truth hierarchical clustering, the ground-truth clustering has an optimal value. We show that this set includes the objective function introduced by Dasgupta.
Equipped with a suitable objective function, we analyze the performance of practical algorithms, as well as develop better and faster algorithms for hierarchical clustering. We also initiate a beyond worst-case analysis of the complexity of the problem and design algorithms for this scenario.
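
As a rough illustration of the cost function from [23] referred to above: for a tree T over the data and pairwise similarities w(i, j), the cost is the sum over all pairs i, j of w(i, j) times the number of leaves under the lowest common ancestor of i and j. The sketch below is not taken from the article; the nested-tuple tree encoding, the dictionary of weights keyed by `frozenset`, and the names `leaves` and `dasgupta_cost` are assumptions made for the example. It checks the two properties mentioned in the abstract on a four-point input, restricted to binary trees.

```python
from itertools import combinations

def leaves(tree):
    """Return the list of leaf labels under a nested-tuple tree (assumed encoding)."""
    if not isinstance(tree, tuple):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]

def dasgupta_cost(tree, w):
    """Sum of w(i, j) * (number of leaves under the lowest common ancestor of i, j).

    `tree` is a leaf label or a tuple of subtrees; `w` maps frozenset({i, j})
    to a similarity, with missing pairs treated as similarity 0.
    """
    if not isinstance(tree, tuple):
        return 0.0
    child_leaves = [leaves(child) for child in tree]
    size_here = sum(len(ls) for ls in child_leaves)
    cost = 0.0
    # Pairs split at this node have this node as their lowest common ancestor.
    for left, right in combinations(child_leaves, 2):
        for i in left:
            for j in right:
                cost += w.get(frozenset((i, j)), 0.0) * size_here
    return cost + sum(dasgupta_cost(child, w) for child in tree)

# Two disconnected components {0, 1} and {2, 3}, unit similarity inside each.
two_components = {frozenset((0, 1)): 1.0, frozenset((2, 3)): 1.0}
print(dasgupta_cost(((0, 1), (2, 3)), two_components))  # 4.0: components split at the root
print(dasgupta_cost(((0, 2), (1, 3)), two_components))  # 8.0: components mixed at the root

# Identical similarities (unit-weight clique): binary trees get the same cost.
clique = {frozenset(p): 1.0 for p in combinations(range(4), 2)}
print(dasgupta_cost(((0, 1), (2, 3)), clique))    # 20.0
print(dasgupta_cost((((0, 1), 2), 3), clique))    # 20.0
```

Separating the two components at the root halves the cost in this toy input, and the balanced and caterpillar trees tie on the clique, which matches the behaviour the abstract describes.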

References

[1] Sanjeev Arora and Ravi Kannan. 2001. Learning mixtures of arbitrary Gaussians. In Proceedings of the 33rd Annual ACM Symposium on Theory of Computing. 247--257.
[2] Sanjeev Arora, Satish Rao, and Umesh V. Vazirani. 2009. Expander flows, geometric embeddings and graph partitioning. J. ACM 56, 2 (2009).
[3] Pranjal Awasthi, Avrim Blum, and Or Sheffet. 2010. Stability yields a PTAS for k-median and k-means clustering. In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS’10). 309--318.
[4] Pranjal Awasthi, Avrim Blum, and Or Sheffet. 2012. Center-based clustering under perturbation stability. Inf. Process. Lett. 112, 1--2 (2012), 49--54.
[5] Pranjal Awasthi and Or Sheffet. 2012. Improved spectral-norm bounds for clustering. In Proceedings of the 15th International Workshop on Approximation, Randomization, and Combinatorial Optimization: Algorithms and Techniques (APPROX’12) and Proceedings of the 16th International Workshop (RANDOM’12). 37--49.
[6] Maria-Florina Balcan, Avrim Blum, and Anupam Gupta. 2009. Approximate clustering without the approximation. In Proceedings of the 20th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA’09). 1068--1077.
[7] Maria-Florina Balcan, Avrim Blum, and Anupam Gupta. 2013. Clustering under approximation stability. J. ACM 60, 2 (2013), 8.
[8] Maria-Florina Balcan and Yingyu Liang. 2016. Clustering under perturbation resilience. SIAM J. Comput. 45, 1 (2016), 102--155.
[9] Maria-Florina Balcan, Heiko Röglin, and Shang-Hua Teng. 2009. Agnostic clustering. In Proceedings of the 20th International Conference on Algorithmic Learning Theory (ALT’09). 384--398.
[10] Maria-Florina Balcan, Avrim Blum, and Santosh Vempala. 2008. A discriminative framework for clustering via similarity functions. In Proceedings of the 40th Annual ACM Symposium on Theory of Computing (STOC’08). ACM, New York, NY, 671--680.
[11] Yonatan Bilu, Amit Daniely, Nati Linial, and Michael E. Saks. 2013. On the practically interesting instances of MAXCUT. In Proceedings of the 30th International Symposium on Theoretical Aspects of Computer Science (STACS’13). 526--537.
[12] Yonatan Bilu and Nathan Linial. 2012. Are stable instances easy? Combin. Probab. Comput. 21, 5 (2012), 643--660.
[13] Ravi B. Boppana. 1987. Eigenvalues and graph bisection: An average-case analysis. In Proceedings of the 28th Annual Symposium on Foundations of Computer Science (FOCS’87). IEEE Computer Society, Los Alamitos, CA, 280--285.
[14] S. Charles Brubaker and Santosh Vempala. 2008. Isotropic PCA and affine-invariant clustering. In Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS’08). 551--560.
[15] Gunnar Carlsson and Facundo Mémoli. 2010. Characterization, stability and convergence of hierarchical clustering methods. J. Mach. Learn. Res. 11 (2010), 1425--1470.
[16] Rui M. Castro, Mark J. Coates, and Robert D. Nowak. 2004. Likelihood based hierarchical clustering. IEEE Trans. Sign. Process. 52, 8 (2004), 2308--2321.
[17] Moses Charikar and Vaggos Chatziafratis. 2017. Approximate hierarchical clustering via sparsest cut and spreading metrics. In Proceedings of the 28th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA’17). 841--854.
[18] Moses Charikar, Vaggos Chatziafratis, and Rad Niazadeh. 2019. Hierarchical clustering better than average-linkage. In Proceedings of the 30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA’19). 2291--2304.
[19] Vaggos Chatziafratis, Rad Niazadeh, and Moses Charikar. 2018. Hierarchical clustering with structural constraints. In Proceedings of the 35th International Conference on Machine Learning (ICML’18). 773--782.
[20] Vincent Cohen-Addad, Varun Kanade, and Frederik Mallmann-Trenn. 2017. Hierarchical clustering beyond the worst-case. In Advances in Neural Information Processing Systems.
[21] Vincent Cohen-Addad, Varun Kanade, Frederik Mallmann-Trenn, and Claire Mathieu. 2018. Hierarchical clustering: Objective functions and algorithms. In Proceedings of the 29th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA’18). Society for Industrial and Applied Mathematics, Philadelphia, PA, 378--397. http://dl.acm.org/citation.cfm?id=3174304.3175293.
[22] Sanjoy Dasgupta. 1999. Learning mixtures of Gaussians. In Proceedings of the 40th Annual Symposium on Foundations of Computer Science (FOCS’99). 634--644.
[23] Sanjoy Dasgupta. 2016. A cost function for similarity-based hierarchical clustering. In Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing (STOC’16). ACM, New York, NY, 118--127.
[24] Sanjoy Dasgupta and Philip M. Long. 2005. Performance guarantees for hierarchical clustering. J. Comput. Syst. Sci. 70, 4 (2005), 555--569.
[25] Sanjoy Dasgupta and Leonard J. Schulman. 2007. A probabilistic analysis of EM for mixtures of separated, spherical Gaussians. J. Mach. Learn. Res. 8 (2007), 203--226.
[26] Justin Eldridge, Mikhail Belkin, and Yusu Wang. 2016. Graphons, mergeons, and so on! In Proceedings of the Annual Conference on Advances in Neural Information Processing Systems. 2307--2315.
[27] Joseph Felsenstein. 2004. Inferring Phylogenies. Vol. 2. Sinauer Associates, Sunderland.
[28] Jerome Friedman, Trevor Hastie, and Robert Tibshirani. 2001. The Elements of Statistical Learning. Vol. 1. Springer Series in Statistics, Berlin.
[29] Wassily Hoeffding. 1963. Probability inequalities for sums of bounded random variables. J. Am. Statist. Assoc. 58, 301 (1963), 13--30.
[30] N. Jardine and R. Sibson. 1972. Mathematical Taxonomy. John Wiley & Sons.
[31] Jon Kleinberg. 2002. An impossibility theorem for clustering. In Advances in Neural Information Processing Systems, Vol. 15. 463--470.
[32] Akshay Krishnamurthy, Sivaraman Balakrishnan, Min Xu, and Aarti Singh. 2012. Efficient active algorithms for hierarchical clustering. In Proceedings of the 29th International Conference on Machine Learning. Omnipress, 267--274. http://dl.acm.org/citation.cfm?id=3042573.3042611
[33] Amit Kumar and Ravindran Kannan. 2010. Clustering with spectral norm and the k-means algorithm. In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS’10). 299--308.
[34] Guolong Lin, Chandrashekhar Nagarajan, Rajmohan Rajaraman, and David P. Williamson. 2006. A general approach for incremental approximation and hierarchical clustering. In Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics, 1147--1156.
[35] Vince Lyzinski, Minh Tang, Avanti Athreya, Youngser Park, and Carey E. Priebe. 2017. Community detection and classification in hierarchical stochastic blockmodels. IEEE Trans. Netw. Sci. Eng. 4, 1 (2017), 13--26.
[36] Konstantin Makarychev, Yury Makarychev, and Aravindan Vijayaraghavan. 2012. Approximation algorithms for semi-random partitioning problems. In Proceedings of the 44th Symposium on Theory of Computing Conference (STOC’12), Howard J. Karloff and Toniann Pitassi (Eds.). ACM, 367--384.
[37] Frank McSherry. 2001. Spectral partitioning of random graphs. In Proceedings of the 42nd Annual Symposium on Foundations of Computer Science (FOCS’01). 529--537.
[38] Benjamin Moseley and Joshua Wang. 2017. Approximation bounds for hierarchical clustering: Average linkage, bisecting K-means, and local search. In Advances in Neural Information Processing Systems, Vol. 30, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.). 3097--3106.
[39] Rafail Ostrovsky, Yuval Rabani, Leonard J. Schulman, and Chaitanya Swamy. 2012. The effectiveness of Lloyd-type methods for the k-means problem. J. ACM 59, 6 (2012), 28.
[40] C. Greg Plaxton. 2003. Approximation algorithms for hierarchical location problems. In Proceedings of the 35th Annual ACM Symposium on Theory of Computing. ACM, 40--49.
[41] Aurko Roy and Sebastian Pokutta. 2016. Hierarchical clustering via spreading metrics. In Advances in Neural Information Processing Systems. 2316--2324.
[42] Peter H. A. Sneath and Robert R. Sokal. 1962. Numerical taxonomy. Nature 193, 4818 (1962), 855--860.
[43] Michael Steinbach, George Karypis, and Vipin Kumar. 2000. A comparison of document clustering techniques. In Proceedings of the ACM Knowledge Discovery and Data Mining Workshop on Text Mining (KDD’00).
[44] Reza Bosagh Zadeh and Shai Ben-David. 2009. A uniqueness theorem for clustering. In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence. 639--646.

Published In

Journal of the ACM  Volume 66, Issue 4
Networking, Computational Complexity, Design and Analysis of Algorithms, Real Computation, Algorithms, Online Algorithms and Computer-aided Verification
August 2019
299 pages
ISSN:0004-5411
EISSN:1557-735X
DOI:10.1145/3338848

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 June 2019
Accepted: 01 March 2019
Revised: 01 March 2019
Received: 01 December 2017
Published in JACM Volume 66, Issue 4

Author Tags

  1. Hierarchical clustering
  2. PCA
  3. stochastic block model

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2025) FedCCL: Federated dual-clustered feature contrast under domain heterogeneity. Information Fusion 113 (102645). DOI: 10.1016/j.inffus.2024.102645. Online publication date: Jan-2025
  • (2024) Clustering of Networks Using the Fish School Search Algorithm. Informatics and Automation 23:5 (1367-1397). DOI: 10.15622/ia.23.5.4. Online publication date: 25-Sep-2024
  • (2024) Fitting Distances by Tree Metrics Minimizing the Total Error within a Constant Factor. Journal of the ACM 71:2 (1-41). DOI: 10.1145/3639453. Online publication date: 10-Apr-2024
  • (2024) A Framework for Combining Lateral and Longitudinal Acceleration to Assess Driving Styles Using Unsupervised Approach. IEEE Transactions on Intelligent Transportation Systems 25:1 (638-656). DOI: 10.1109/TITS.2023.3310213. Online publication date: Jan-2024
  • (2024) On the Fundamental Limits of Matrix Completion: Leveraging Hierarchical Similarity Graphs. IEEE Transactions on Information Theory 70:3 (2039-2075). DOI: 10.1109/TIT.2023.3345902. Online publication date: 1-Mar-2024
  • (2024) Unveiling Insights: Exploring Healthcare Data through Data Analysis. 2024 International Conference on Communication, Computer Sciences and Engineering (IC3SE) (575-581). DOI: 10.1109/IC3SE62002.2024.10593333. Online publication date: 9-May-2024
  • (2024) Morphology-Driven Nanofiller Size Measurement Integrated with Micromechanical Finite Element Analysis for Quantifying Interphase in Polymer Nanocomposites. ACS Applied Materials & Interfaces 16:30 (39927-39941). DOI: 10.1021/acsami.4c02797. Online publication date: 17-Jul-2024
  • (2024) Sparrow search mechanism-based effective feature mining algorithm for the broken wire signal detection of prestressed concrete cylinder pipe. Mechanical Systems and Signal Processing 212 (111270). DOI: 10.1016/j.ymssp.2024.111270. Online publication date: Apr-2024
  • (2024) Machine learning-assisted nanosensor arrays: An efficiently high-throughput food detection analysis. Trends in Food Science & Technology 149 (104564). DOI: 10.1016/j.tifs.2024.104564. Online publication date: Jul-2024
  • (2024) Sleep quality relates to language impairment in children with autism spectrum disorder without intellectual disability. Sleep Medicine 117 (99-106). DOI: 10.1016/j.sleep.2024.03.028. Online publication date: May-2024