Efficient nonparametric and asymptotic Bayesian model selection methods for attributed graph clustering

Xu, Zhiqiang; Cheng, James; Xiao, Xiaokui; Fujimaki, Ryohei; Muraoka, Yusuke

doi:10.1007/s10115-017-1030-8

Regular Paper
Published: 16 February 2017

Volume 53, pages 239–268, (2017)

Knowledge and Information Systems

Zhiqiang Xu¹,
James Cheng²,
Xiaokui Xiao³,
Ryohei Fujimaki⁴ &
Yusuke Muraoka⁵

Abstract

Attributed graph clustering, also known as community detection on attributed graphs, has attracted much interest recently due to the ubiquity of attributed graphs in real life. Many algorithms have been proposed for this problem, and they are either distance based or model based. However, model selection in attributed graph clustering has not been well addressed; that is, most existing algorithms assume the cluster number to be known a priori. In this paper, we propose two efficient approaches for attributed graph clustering with automatic model selection. The first approach is a popular Bayesian nonparametric method, while the second is an asymptotic method based on a recently proposed model selection criterion, the factorized information criterion. Experimental results on both synthetic and real datasets demonstrate that our approaches for attributed graph clustering with automatic model selection significantly outperform the state-of-the-art algorithm.

Notes

i.e., we consider only node-attributed graphs throughout the paper.
Non-regular models are models that do not satisfy the regularity conditions assumed by BIC [4].
The zero diagonal of ${\mathbf {X}}$ means no self-loops in the corresponding graph while symmetry means that the graph is undirected, in accordance with our focus on undirected simple graphs.
The definition of our clustering requires as few edges as possible between distinct clusters.
Multinomial and Dirichlet distributions are conjugate. As a special case, Bernoulli and Beta distributions are conjugate as well.
The stick-breaking prior is a representation of the Dirichlet process and is often used for variational inference. The Dirichlet process here is the distribution of a random probability measure over positive integers.
That is, each prior is a uniform distribution over the components. This is reasonable given that we do not have any prior information on the proportions of the different components, and thus they are treated as equally important.
The corresponding assortativity coefficient is negative, $r=-0.079$.
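
Two of the notes above, on conjugate priors and on the stick-breaking prior, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the concentration parameter alpha, the truncation level, and the seed are assumptions chosen for illustration. The stick-breaking draw uses the common finite truncation in which the last break takes the whole remaining stick, so that the weights sum to one.

```python
import random


def stick_breaking_weights(alpha, truncation, seed=0):
    """Draw mixture weights from a truncated stick-breaking prior.

    alpha, truncation, and seed are illustrative parameters (assumptions),
    not values taken from the paper.
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of the stick that has not been assigned yet
    for k in range(truncation):
        # Each break fraction is Beta(1, alpha); the final break takes the
        # whole remainder so that the truncated weights sum to one.
        v = 1.0 if k == truncation - 1 else rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights


def beta_bernoulli_posterior(a, b, observations):
    """Conjugate update: a Beta(a, b) prior with 0/1 data gives a Beta posterior."""
    ones = sum(observations)
    zeros = len(observations) - ones
    return a + ones, b + zeros


weights = stick_breaking_weights(alpha=1.0, truncation=10)
posterior = beta_bernoulli_posterior(1.0, 1.0, [1, 0, 1, 1])
```

With a Beta(1, 1) prior and the observations 1, 0, 1, 1, the update returns the parameters of Beta(4, 2). Smaller values of alpha tend to concentrate the stick-breaking weights on the first few components.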

References

Akoglu L, Tong H, Meeder B, Faloutsos C (2012) Pics: parameter-free identification of cohesive subgroups in large attributed graphs. SDM, pp 439–450
Banerjee B, Bovolo F, Bhattacharya A, Bruzzone L, Chaudhuri S, Mohan BK (2015) A new self-training-based unsupervised satellite image classification technique using cluster ensemble strategy. IEEE Geosci Remote Sens Lett 12(4):741–745
Beal MJ (2003) Variational algorithms for approximate Bayesian inference. PhD thesis, Gatsby Computational Neuroscience Unit, University College London
Bishop CM (2006) Pattern recognition and machine learning (information science and statistics). Springer, Secaucus
Bothorel C, Cruz JD, Magnani M, Micenková B (2015) Clustering attributed graphs: models, measures and methods. CoRR arXiv:1501.01676
Daudin J-J, Picard F, Robin S (2008) A mixture model for random graphs. Stat Comput 18(2):173–183
Ester M, Ge R, Gao BJ, Hu Z, Ben-Moshe B (2006) Joint cluster analysis of attribute data and relationship data: the connected k-center problem. In: Proceedings of the sixth SIAM international conference on data mining, Bethesda, MD, USA, 20–22 April 2006. pp 246–257. doi:10.1137/1.9781611972764.22
Fujimaki R, Hayashi K (2012) Factorized asymptotic Bayesian hidden Markov models. In: Proceedings of the 29th international conference on machine learning, ICML 2012, Edinburgh, Scotland, UK, 26 June–1 July, 2012
Fujimaki R, Morinaga S (2012) Factorized asymptotic Bayesian inference for mixture modeling. In: Proceedings of the fifteenth international conference on artificial intelligence and statistics, AISTATS 2012, La Palma, Canary Islands, 21–23 April 2012. pp 400–408
Ghahramani Z, Beal MJ (1999) Variational inference for Bayesian mixtures of factor analysers. In: Advances in neural information processing systems 12, NIPS conference, Denver, Colorado, USA, 29 November–4 December, 1999. pp 449–455
Henderson K, Eliassi-Rad T, Papadimitriou S, Faloutsos C (2010) Hcdf: a hybrid community discovery framework. In: Proceedings of the SIAM international conference on data mining, SDM 2010, Columbus, Ohio, USA, 29 April–1 May, 2010. pp 754–765. doi:10.1137/1.9781611972801.66
Henderson K, Gallagher B, Eliassi-Rad T, Tong H, Basu S, Akoglu L, Koutra D, Faloutsos C, Li L (2012) Rolx: structural role extraction & mining in large graphs. In: The 18th ACM SIGKDD international conference on knowledge discovery and data mining, KDD’12, Beijing, China, August 12–16, 2012, pp 1231–1239
Hofmann T (1999) Probabilistic latent semantic indexing. In: SIGIR ’99: Proceedings of the 22nd annual international ACM SIGIR conference on research and development in information retrieval, Berkeley, CA, USA, 15–19 August 1999. pp 50–57. doi:10.1145/312624.312649
Jordan MI, Ghahramani Z, Jaakkola T, Saul LK (1999) An introduction to variational methods for graphical models. Mach Learn 37(2):183–233
Karypis G, Kumar V (1998) A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J Sci Comput 20(1):359–392
Kurihara K, Welling M, Teh YW (2007) Collapsed variational Dirichlet process mixture models. In: IJCAI 2007, Proceedings of the 20th international joint conference on artificial intelligence, Hyderabad, India, January 6–12, 2007. pp 2796–2801
Lazarsfeld PF, Henry NW (1968) Latent structure analysis. Houghton Mifflin, Boston
Lu Z, Sun X, Wen Y, Cao G, Porta TFL (2015) Algorithms and applications for community detection in weighted networks. IEEE Trans Parallel Distrib Syst 26(11):2916–2926
Luo G (2016) A review of automatic selection methods for machine learning algorithms and hyper-parameter values. NetMAHIB 5(1):18. doi:10.1007/s13721-016-0125-6
Miller JW, Harrison MT (2013) A simple example of Dirichlet process mixture inconsistency for the number of components. In: Advances in neural information processing systems, vol 26, pp 199–206
Moser F, Ge R, Ester M (2007) Joint cluster analysis of attribute and relationship data without a-priori specification of the number of clusters. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining, San Jose, California, USA, 12–15 August 2007. pp 510–519. doi:10.1145/1281192.1281248
Nallapati R, Ahmed A, Xing EP, Cohen WW (2008) Joint latent topic models for text and citations. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, Las Vegas, Nevada, USA, 24–27 August 2008. pp 542–550. doi:10.1145/1401890.1401957
Newman ME (2002) Assortative mixing in networks. Phys Rev Lett 89(20):208701
Newman MEJ, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69:066113
Ng AY, Jordan MI, Weiss Y (2001) On spectral clustering: analysis and an algorithm. In: Advances in neural information processing systems 14 [neural information processing systems: natural and synthetic, NIPS 2001, December 3–8, 2001, Vancouver, British Columbia, Canada], pp 849–856
Nowicki K, Snijders TA (2001) Estimation and prediction for stochastic blockstructures. J Am Stat Assoc 96(455):1077–1087
Papadopoulos A, Rafailidis D, Pallis G, Dikaiakos MD (2015) Clustering attributed multi-graphs with information ranking. In: Database and expert systems applications—26th international conference, DEXA 2015, Valencia, Spain, September 1–4, 2015. Proceedings, Part I, pp 432–446
Semertzidis T, Rafailidis D, Strintzis MG, Daras P (2015) Large-scale spectral clustering based on pairwise constraints. Inf Process Manag 51(5):616–624
Steinhaeuser K, Chawla NV (2008) Community detection in a large real-world social network. In: Social computing, behavioral modeling, and prediction, pp 168–175
Strehl A, Ghosh J (2002) Cluster ensembles—a knowledge reuse framework for combining multiple partitions. J Mach Learn Res 3:583–617
Sun Y, Aggarwal CC, Han J (2012) Relation strength-aware clustering of heterogeneous information networks with incomplete attributes. PVLDB 5(5):394–405
Teh YW (2010) Dirichlet process. In: Encyclopedia of machine learning, pp 280–287. doi:10.1007/978-0-387-30164-8_219
Vretos N, Solachidis V, Pitas I (2011) A mutual information based face clustering algorithm for movie content analysis. Image Vis Comput 29(10):693–705
Xu Z, Ke Y (2016) Effective and efficient spectral clustering on text and link data. In: Proceedings of the 25th ACM international on conference on information and knowledge management, CIKM 2016, Indianapolis, IN, USA, October 24–28, 2016, pp 357–366
Xu Z, Ke Y (2016) Stochastic variance reduced Riemannian eigensolver. CoRR arXiv:1605.08233
Xu Z, Ke Y, Wang Y (2014) A fast inference algorithm for stochastic blockmodel. In: 2014 IEEE international conference on data mining, ICDM 2014, Shenzhen, China, December 14–17, 2014, pp 620–629
Xu Z, Ke Y, Wang Y, Cheng H, Cheng J (2012) A model-based approach to attributed graph clustering. In: SIGMOD conference, pp 505–516
Xu Z, Ke Y, Wang Y, Cheng H, Cheng J (2014) GBAGC: a general Bayesian framework for attributed graph clustering. TKDD 9(1):5:1–5:43
Xu Z, Zhao P, Cao J, Li X (2016) Matrix eigen-decomposition via doubly stochastic Riemannian optimization. In: Proceedings of the 33rd international conference on machine learning, ICML 2016, New York City, NY, USA, June 19–24, 2016, pp 1660–1669
Yang J, McAuley JJ, Leskovec J (2013) Community detection in networks with node attributes. In: IEEE 13th international conference on data mining, Dallas, TX, USA, 7–10 December 2013. pp 1153–1156. doi:10.1109/ICDM.2013.167
Yang T, Jin R, Chi Y, Zhu S (2009) Combining link and content for community detection: a discriminative approach. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining, Paris, France, 28 June–1 July, 2009. pp 927–936. doi:10.1145/1557019.1557120
Yu S, Yu K, Tresp V, Kriegel H-P (2006) Variational Bayesian Dirichlet-multinomial allocation for exponential family mixtures. In: Machine learning: ECML 2006, 17th European conference on machine learning, Berlin, Germany, 18–22 September 2006. pp 841–848. doi:10.1007/11871842_87
Zanghi H, Volant S, Ambroise C (2010) Clustering based on random graph model embedding vertex features. Pattern Recognit Lett 31(9):830–836
Zhou T, Lü L, Zhang Y (2009) Predicting missing links via local information. Eur Phys J B Condens Matter Complex Syst 71(4):623–630
Zhou Y, Cheng H, Yu JX (2009) Graph clustering based on structural/attribute similarities. PVLDB 2(1):718–729
Zobay O (2009) Mean field inference for the Dirichlet process mixture model. Electron J Stat 3:507–545

Acknowledgements

The authors would like to thank the anonymous reviewers for their valuable comments, which helped significantly improve the quality of the paper.

Author information

Authors and Affiliations

Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
Zhiqiang Xu
Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong, China
James Cheng
School of Computer Engineering, Nanyang Technological University, Singapore, Singapore
Xiaokui Xiao
NEC Laboratories America, 10080 North Wolfe Road SW-350, Cupertino, CA, 95014, USA
Ryohei Fujimaki
NEC Laboratories Japan, 1753, Shimonumabe, Nakahara-ku, Kawasaki-shi, Kanagawa, 211-8666, Japan
Yusuke Muraoka

Corresponding author

Correspondence to Zhiqiang Xu.

About this article

Cite this article

Xu, Z., Cheng, J., Xiao, X. et al. Efficient nonparametric and asymptotic Bayesian model selection methods for attributed graph clustering. Knowl Inf Syst 53, 239–268 (2017). https://doi.org/10.1007/s10115-017-1030-8

Received: 08 October 2015
Revised: 02 February 2017
Accepted: 04 February 2017
Published: 16 February 2017
Issue Date: October 2017
DOI: https://doi.org/10.1007/s10115-017-1030-8
