PHA: A fast potential-based hierarchical agglomerative clustering method

Published: 01 May 2013
Abstract

    A novel potential-based hierarchical agglomerative (PHA) clustering method is proposed. We first construct a hypothetical potential field over all the data points and show that this field is closely related to a nonparametric estimate of the global probability density function of the data. We then propose a new similarity metric that combines the potential field, which captures global data-distribution information, with the distance matrix, which captures local data-distribution information. Finally, we derive an equivalent similarity metric based on an edge-weighted tree of all the data points, which leads to a fast agglomerative clustering algorithm with time complexity O(N^2). The proposed PHA method is evaluated against six other typical agglomerative clustering methods on four synthetic data sets and two real data sets. Experiments show that it runs much faster than the other methods and produces the most satisfactory results in most cases.
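
    As an illustration of the general approach described in the abstract, the following is a minimal Python sketch of a potential-based agglomerative procedure. The kernel width sigma, the Gaussian-kernel potential, the rule that links each point to its nearest neighbour of lower potential, and the use of squared distance as the edge weight are all assumptions made for this sketch; the abstract does not specify the paper's actual similarity metric or tree construction.

        import numpy as np

        def pha_sketch(X, sigma=1.0):
            """Hedged sketch of potential-based agglomerative clustering (not the paper's exact method)."""
            # Pairwise squared Euclidean distances: O(N^2) time and space.
            d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
            # Potential field as a sign-flipped Parzen-style density estimate:
            # dense regions get low (very negative) potential.
            potential = -np.exp(-d2 / (2.0 * sigma ** 2)).sum(axis=1)

            # Edge-weighted tree (assumed rule): every point except the global
            # potential minimum links to its nearest neighbour among points of
            # lower potential, giving N-1 edges in O(N^2) total work.
            order = np.argsort(potential)
            edges = []
            for rank, i in enumerate(order):
                if rank == 0:
                    continue                        # root: the global potential minimum
                lower = order[:rank]                # points with lower potential
                j = lower[np.argmin(d2[i, lower])]  # closest such point
                edges.append((float(d2[i, j]), int(i), int(j)))

            # Merging edges in ascending weight order gives an agglomerative
            # merge sequence; cutting the k-1 heaviest edges gives k clusters.
            return sorted(edges)

    For example, calling pha_sketch on an (N, d) array and removing the k-1 heaviest of the returned N-1 tree edges partitions the data into k clusters, while the full sorted edge list plays the role of the agglomerative merge order.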


      Published In

      cover image Pattern Recognition
      Pattern Recognition  Volume 46, Issue 5
      May, 2013
      296 pages

      Publisher

      Elsevier Science Inc.

      United States

      Author Tags

      1. Algorithm
      2. Clustering
      3. Pattern recognition
      4. Potential field
