
Fast Flexible Bipartite Graph Model for Co-Clustering

Published: 01 July 2023

Abstract

Co-clustering methods use the correlation between samples and attributes to explore the co-occurrence structure in data. These methods have played a significant role in gene expression analysis, image segmentation, and document clustering. In bipartite graph partition-based co-clustering methods, the relationship between samples and attributes is described by constructing a diagonal symmetric bipartite graph matrix, which is then clustered following the principles of spectral clustering. However, this approach not only has high time complexity but also forces the number of row clusters to equal the number of column clusters. In real-world data, rows and columns often have different numbers of categories. To address these problems, this paper proposes a novel fast flexible bipartite graph model for co-clustering (FBGPC) that constructs the bipartite graph directly from the original matrix. It then uses the inflation operation to partition the bipartite graph, learning the co-occurrence structure of the original data matrix based on the inherent relationship between bipartite graph partitioning and co-clustering. Finally, hierarchical clustering is used to obtain the clustering results according to the set relationship of the co-occurrence structure. Extensive empirical results show the effectiveness of the proposed model and verify its speed, generality, and flexibility.
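
To make the two constructions mentioned in the abstract concrete, the following is a minimal sketch, not the paper's FBGPC implementation. It builds the symmetric bipartite graph matrix that spectral co-clustering methods operate on, and applies an inflation step directly to the original matrix. The abstract does not define its inflation operation; the sketch assumes it resembles the inflation operator of Markov clustering (element-wise power followed by column normalization). The function names `bipartite_adjacency` and `inflate`, the parameter `r`, and the toy matrix `A` are all hypothetical.

```python
import numpy as np


def bipartite_adjacency(A):
    """Symmetric (n+m) x (n+m) matrix with zero diagonal blocks.

    Rows of A are samples, columns are attributes. This is the larger
    matrix that spectral-style co-clustering methods work on.
    """
    n, m = A.shape
    return np.block([[np.zeros((n, n)), A],
                     [A.T, np.zeros((m, m))]])


def inflate(M, r=2.0):
    """Assumed MCL-style inflation: element-wise power, then column scaling.

    Strong entries in each column gain weight and weak ones fade, which
    sharpens the block structure. Requires non-negative entries.
    """
    P = np.power(M, r)
    col_sums = P.sum(axis=0, keepdims=True)
    col_sums[col_sums == 0] = 1.0  # avoid dividing all-zero columns by zero
    return P / col_sums


# Toy 3 x 3 sample-by-attribute matrix (illustrative values only).
A = np.array([[5.0, 4.0, 0.0],
              [4.0, 5.0, 1.0],
              [0.0, 1.0, 6.0]])

B = bipartite_adjacency(A)   # 6 x 6 symmetric matrix
P = inflate(A, r=2.0)        # same shape as A, columns sum to 1
```

Working on the 3 x 3 matrix `A` rather than the 6 x 6 matrix `B` illustrates, under these assumptions, why starting from the original matrix can be cheaper than the symmetric construction.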



Information

Published In

IEEE Transactions on Knowledge and Data Engineering, Volume 35, Issue 7
July 2023
1090 pages

Publisher

IEEE Educational Activities Department

United States


Qualifiers

  • Research-article


Cited By

  • (2024) Co-clustering: A Survey of the Main Methods, Recent Trends, and Open Problems. ACM Computing Surveys 57:2 (1-33). DOI: 10.1145/3698875. Online publication date: 4-Oct-2024
  • (2024) A Survey of Co-Clustering. ACM Transactions on Knowledge Discovery from Data 18:9 (1-28). DOI: 10.1145/3681793. Online publication date: 25-Jul-2024
  • (2024) DNSRF: Deep Network-based Semi-NMF Representation Framework. ACM Transactions on Intelligent Systems and Technology 15:5 (1-20). DOI: 10.1145/3670408. Online publication date: 3-Jun-2024
  • (2024) T-Distributed Stochastic Neighbor Embedding for Co-Representation Learning. ACM Transactions on Intelligent Systems and Technology 15:2 (1-18). DOI: 10.1145/3627823. Online publication date: 22-Feb-2024
  • (2024) Fast parameterless prototype-based co-clustering. Machine Learning 113:4 (2153-2181). DOI: 10.1007/s10994-023-06474-y. Online publication date: 1-Apr-2024
  • (2023) A Trend of AI Conference Convergence in Similarity: An Empirical Study Through Trans-Temporal Heterogeneous Graph. IEEE Transactions on Knowledge and Data Engineering 35:9 (9642-9655). DOI: 10.1109/TKDE.2023.3243602. Online publication date: 1-Sep-2023
