Abstract
Graph clustering is an important unsupervised learning task in complex network analysis, and its recent progress relies mainly on graph autoencoder (GAE) models. However, these methods have three major drawbacks. (1) Most autoencoder models choose graph convolutional networks (GCNs) as their encoders, but the filters and weight matrices in GCN encoders are entangled, which degrades the quality of the learned representations. (2) Real graphs are often sparse, requiring multilayer propagation to generate effective features, but GCN encoders are prone to oversmoothing when many layers are stacked. (3) Existing methods ignore the distribution of the node features in the feature space during the embedding stage, making their results ill-suited to clustering tasks. To alleviate these problems, in this paper we propose GLASS, a novel graph Laplacian autoencoder with subspace clustering regularization for graph clustering. Specifically, we first use Laplacian smoothing filters instead of GCNs for feature propagation and multilayer perceptrons (MLPs) for nonlinear transformations, thereby disentangling the convolutional filters from the weight matrices. Because multilayer propagation is prone to oversmoothing, we further add residual connections between the Laplacian smoothing filters to strengthen the multilayer feature propagation capability of GLASS. In addition, to achieve better clustering performance, we introduce a subspace clustering regularization term that constrains the autoencoder to produce node features that are more representative and better suited to clustering. Experiments on node clustering and image clustering with four widely used network datasets and three image datasets show that our method outperforms existing state-of-the-art methods. We further verify the effectiveness of the proposed method through link prediction, complexity analysis, parameter analysis, data visualization, and ablation studies.
The experimental results demonstrate the effectiveness of the proposed GLASS approach and show that it largely overcomes the shortcomings of GCN encoders. The method not only supports deeper graph encoding but also adaptively fits the subspace distribution of the given data, which we expect to inspire further research on neural networks and autoencoders.
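The abstract's core architectural idea (a fixed Laplacian smoothing filter for propagation, decoupled from the learned transformation, with residual connections to curb oversmoothing) can be sketched as follows. This is a minimal illustration of the technique, not the authors' implementation; the filter strength `k`, layer count, and the toy graph are assumptions chosen for demonstration.

```python
import numpy as np

def sym_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} (A + I) D^{-1/2},
    with self-loops added as in GCN-style normalization."""
    n = adj.shape[0]
    a = adj + np.eye(n)                       # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    a_norm = d_inv_sqrt[:, None] * a * d_inv_sqrt[None, :]
    return np.eye(n) - a_norm

def smooth_with_residuals(x, lap, layers=3, k=0.5):
    """Apply the Laplacian smoothing filter H = I - k*L repeatedly,
    adding a residual connection after each layer (illustrative of the
    oversmoothing countermeasure; k and layers are hypothetical)."""
    h = np.eye(lap.shape[0]) - k * lap
    out = x
    for _ in range(layers):
        out = h @ out + out                   # residual connection
    return out

# Toy 4-node path graph with 2-d node features.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = np.random.default_rng(0).normal(size=(4, 2))
z = smooth_with_residuals(x, sym_laplacian(adj))
print(z.shape)  # smoothed features, same shape as the input: (4, 2)
```

In the full model, the smoothed features `z` would then pass through an MLP encoder, so the fixed filter handles propagation while only the MLP weights are trained; this is the disentanglement the abstract refers to.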
Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Funding
This study is funded in part by the National Natural Science Foundation of China (Nos. 61906002, 62076005, U20A20398), the Natural Science Foundation of Anhui Province (2008085QF306, 2008085MF191, 2008085UD07), and the University Synergy Innovation Program of Anhui Province, China (GXXT-2021-002).
Ethics declarations
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed Consent
Informed consent was not required as no human or animals were involved.
Conflict of Interest
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sun, D., Liu, L., Luo, B. et al. GLASS: A Graph Laplacian Autoencoder with Subspace Clustering Regularization for Graph Clustering. Cogn Comput 15, 803–821 (2023). https://doi.org/10.1007/s12559-022-10098-0