A new formulation of sparse multiple kernel k-means clustering and its applications

Published: 01 September 2023

Abstract

Multiple kernel k-means (MKKM) clustering has been an important research topic in statistical machine learning and data mining over the last few decades. MKKM combines a group of prespecified base kernels to improve clustering performance. Although many efforts have been made to further improve MKKM, existing methods do not sufficiently exploit the potential structure of the partition matrix. In this paper, we propose a novel sparse multiple kernel k-means (SMKKM) clustering method that introduces an ℓ1-norm penalty to induce sparsity in the partition matrix. We then design an efficient alternating algorithm with a curvilinear search technique. More importantly, we establish the convergence and complexity analysis of the designed algorithm based on the optimality conditions of the SMKKM. Finally, extensive numerical experiments on synthetic and benchmark datasets demonstrate that the proposed method outperforms state-of-the-art methods in terms of clustering performance and robustness.
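To make the idea in the abstract concrete, the sketch below illustrates the general recipe it describes: combine prespecified base kernels with learned weights, compute a relaxed partition matrix from the combined kernel, and apply an ℓ1-style soft threshold to encourage sparsity in that matrix. This is only an illustrative sketch under standard MKKM assumptions, not the authors' SMKKM algorithm (which uses an alternating scheme with a curvilinear search); the function names, the soft-thresholding step, and the alignment-based kernel re-weighting are hypothetical simplifications.

import numpy as np


def combined_kernel(base_kernels, weights):
    """Weighted combination K = sum_p w_p * K_p of prespecified base kernels."""
    return sum(w * K for w, K in zip(weights, base_kernels))


def relaxed_partition(K, n_clusters):
    """Spectral relaxation of kernel k-means: top-c eigenvectors of K."""
    eigvals, eigvecs = np.linalg.eigh(K)
    return eigvecs[:, -n_clusters:]          # n x c relaxed partition matrix H


def soft_threshold(H, lam):
    """l1 proximal step: shrink small entries of H toward zero (sparsity)."""
    return np.sign(H) * np.maximum(np.abs(H) - lam, 0.0)


def sparse_mkkm_sketch(base_kernels, n_clusters, lam=0.05, n_iter=20):
    """Illustrative alternating loop over kernel weights and a sparse partition."""
    m = len(base_kernels)
    weights = np.full(m, 1.0 / m)            # start from uniform kernel weights
    for _ in range(n_iter):
        K = combined_kernel(base_kernels, weights)
        H = relaxed_partition(K, n_clusters)
        H = soft_threshold(H, lam)           # induce sparsity in the partition matrix
        # Re-weight each kernel by its alignment with the current partition,
        # a common heuristic in multiple kernel clustering (not the paper's rule).
        align = np.array([np.trace(H.T @ Kp @ H) for Kp in base_kernels])
        weights = np.maximum(align, 1e-12)
        weights /= weights.sum()
    return H, weights

Cluster assignments would then typically be recovered by running ordinary k-means on the rows of the sparse partition matrix H.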



Published In

Statistical Analysis and Data Mining, Volume 16, Issue 5
October 2023, 100 pages
ISSN: 1932-1864
EISSN: 1932-1872
DOI: 10.1002/sam.v16.5

Publisher

John Wiley & Sons, Inc., United States

Author Tags

  1. alternating direction method of multipliers
  2. curvilinear search technology
  3. multiple kernel k-means clustering
  4. sparse optimization

Qualifiers

  • Research-article
