Self-paced Adaptive Bipartite Graph Learning for Consensus Clustering

Published: 27 February 2023

Abstract

Consensus clustering provides an elegant framework for aggregating multiple weak clustering results into a consensus result that is more robust and stable than any single one. However, most existing methods use all of the data for consensus learning and ignore the side effects caused by unreliable or difficult data. To address this issue, in this article, we propose a novel self-paced consensus clustering method with adaptive bipartite graph learning, which gradually involves data in consensus learning, from the more reliable to the less reliable. First, we construct an initial bipartite graph from the base results, where the nodes represent the clusters and instances, and an edge indicates that an instance belongs to a cluster. Then, we adaptively learn a structured bipartite graph from this initial one by self-paced learning; that is, we automatically determine the reliability of each edge with adaptive cluster similarity measuring and involve the edges in bipartite graph learning in order of their reliability. Finally, we obtain the final consensus result from the learned structured bipartite graph. We conduct extensive experiments on both toy and benchmark datasets, and the results show the effectiveness and superiority of our method. The code for this article is released at http://Doctor-Nobody.github.io/codes/code_SCCABG.zip.
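
To make the first step of the abstract concrete, the following is a minimal sketch of how an initial instance-cluster bipartite graph could be built from a set of base clustering results. It is not taken from the released code: the function name build_bipartite_graph, the input format (a list of integer label arrays) and the toy labels are assumptions made for illustration. The self-paced edge weighting and the structured graph learning steps are not shown.

```python
import numpy as np


def build_bipartite_graph(base_labels):
    """Build a 0/1 instance-cluster incidence matrix from base clusterings.

    base_labels: list of 1-D integer arrays, one per base clustering, each of
                 length n (a hypothetical input format chosen for this sketch).
    Returns an (n, c) matrix B, where c is the total number of clusters over
    all base clusterings and B[i, j] = 1 iff instance i belongs to cluster j.
    """
    blocks = []
    for labels in base_labels:
        labels = np.asarray(labels)
        n = labels.shape[0]
        # Relabel the clusters of this base result as 0..k-1.
        _, inverse = np.unique(labels, return_inverse=True)
        inverse = inverse.reshape(-1)
        k = int(inverse.max()) + 1
        # One-hot indicator block: one column per cluster of this base result.
        block = np.zeros((n, k), dtype=int)
        block[np.arange(n), inverse] = 1
        blocks.append(block)
    # Columns of all base clusterings side by side form the cluster nodes.
    return np.hstack(blocks)


# Toy example: 5 instances, two base clusterings with 3 and 2 clusters.
base_labels = [
    np.array([0, 0, 1, 1, 2]),
    np.array([1, 1, 1, 0, 0]),
]
B = build_bipartite_graph(base_labels)
print(B)
```

Each row of B corresponds to an instance node and each column to a cluster node, so every row contains exactly one edge per base clustering. A method of the kind described in the abstract would then reweight or prune these edges rather than treat them all as equally reliable.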

    Published In

    ACM Transactions on Knowledge Discovery from Data  Volume 17, Issue 5
    June 2023
    386 pages
    ISSN:1556-4681
    EISSN:1556-472X
    DOI:10.1145/3583066

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 27 February 2023
    Online AM: 27 September 2022
    Accepted: 19 September 2022
    Revised: 31 August 2022
    Received: 28 January 2022
    Published in TKDD Volume 17, Issue 5

    Author Tags

    1. Consensus clustering
    2. clustering ensemble
    3. bipartite graph learning
    4. self-paced learning

    Qualifiers

    • Research-article

    Funding Sources

    • National Natural Science Foundation of China
    • Natural Science Foundation of Anhui Province

    Article Metrics

    • Downloads (Last 12 months): 182
    • Downloads (Last 6 weeks): 17
    Reflects downloads up to 09 Feb 2025

    Cited By

    • (2025) Fair Clustering Ensemble With Equal Cluster Capacity. IEEE Transactions on Pattern Analysis and Machine Intelligence 47:3 (1729-1746). DOI: 10.1109/TPAMI.2024.3507857. Online publication date: Mar-2025
    • (2025) Data-Centric Graph Learning: A Survey. IEEE Transactions on Big Data 11:1 (1-20). DOI: 10.1109/TBDATA.2024.3489412. Online publication date: Feb-2025
    • (2024) Active Clustering Ensemble With Self-Paced Learning. IEEE Transactions on Neural Networks and Learning Systems 35:9 (12186-12200). DOI: 10.1109/TNNLS.2023.3252586. Online publication date: Sep-2024
    • (2024) Partial Clustering Ensemble. IEEE Transactions on Knowledge and Data Engineering 36:5 (2096-2109). DOI: 10.1109/TKDE.2023.3321913. Online publication date: May-2024
    • (2024) Jointly Learn the Base Clustering and Ensemble for Deep Image Clustering. 2024 IEEE International Conference on Multimedia and Expo (ICME) (1-6). DOI: 10.1109/ICME57554.2024.10687406. Online publication date: 15-Jul-2024
    • (2024) Higher Order Multiple Graph Filtering for Structured Graph Learning. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (7095-7099). DOI: 10.1109/ICASSP48485.2024.10446826. Online publication date: 14-Apr-2024
    • (2024) K-Means Clustering Based on Chebyshev Polynomial Graph Filtering. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (7175-7179). DOI: 10.1109/ICASSP48485.2024.10446384. Online publication date: 14-Apr-2024
    • (2024) A clustering ensemble algorithm for handling deep embeddings using cluster confidence. The Computer Journal. DOI: 10.1093/comjnl/bxae101. Online publication date: 14-Oct-2024
    • (2024) Multi-view Outlier Detection via Graphs Denoising. Information Fusion 101:C. DOI: 10.1016/j.inffus.2023.102012. Online publication date: 1-Jan-2024
    • (2024) Ensemble clustering with low-rank optimal Laplacian matrix learning. Applied Soft Computing 150:C. DOI: 10.1016/j.asoc.2023.111095. Online publication date: 12-Apr-2024
