CommunityAF: An Example-Based Community Search Method via Autoregressive Flow

Published: 01 June 2023

Abstract

Example-based community search utilizes hidden patterns of given examples rather than explicit rules, reducing users' burden and enhancing flexibility. However, existing works face challenges such as low scalability, high training cost, and improper termination during the search. To tackle all these issues, this paper proposes a community search framework named CommunityAF with three well-designed components. The first is a GNN (graph neural network) component that combines community-aware structure features to incrementally learn node embeddings over a large graph for the other two components. The second is an autoregressive flow-based generation component designed for fast training and model stability. The third is a scoring component that evaluates the communities and provides scores for a stable termination. Moreover, to show that CommunityAF has sufficient expressive power to cover the rules, we demonstrate that the scoring component, with node features weighted by degree-related factors, is able to mimic the existing structure-based community metrics. We introduce a square ranking loss to guide the training of the scoring component, and further devise a flexible termination strategy based on the inferred score change pattern over a sequence of candidate communities using beam search. We compare CommunityAF with four different categories of community search methods on six real-world datasets. The results illustrate that CommunityAF outperforms these community search methods, achieving an average 15.3% improvement in effectiveness and 4x to 20x speedups on different datasets relative to the state-of-the-art generative method.
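
As a rough illustration of the claim that degree-related node features can reproduce a structure-based community metric, the sketch below computes conductance twice: once by counting cut edges directly, and once by summing two per-node quantities (degree and internal degree) over the community. This is not the paper's scoring component. The choice of conductance, its simplified cut/volume form, the two features, and all function names here are assumptions made for the example; the abstract does not specify which metrics or weightings are used.

```python
from typing import Dict, Hashable, Set

# Undirected graph as an adjacency map: node -> set of neighbours.
Graph = Dict[Hashable, Set[Hashable]]


def conductance_direct(adj: Graph, community: Set[Hashable]) -> float:
    """Conductance in its simple form: cut edges / volume of the community."""
    cut = sum(1 for u in community for v in adj[u] if v not in community)
    volume = sum(len(adj[u]) for u in community)
    return cut / volume if volume else 0.0


def conductance_from_node_features(adj: Graph, community: Set[Hashable]) -> float:
    """Same value, obtained only by sum-pooling per-node features.

    Hypothetical features per node u in the community:
      degree(u)          -- number of neighbours in the whole graph
      internal_degree(u) -- number of neighbours inside the community
    """
    features = {
        u: (len(adj[u]), sum(1 for v in adj[u] if v in community))
        for u in community
    }
    total_degree = sum(deg for deg, _ in features.values())
    total_internal = sum(internal for _, internal in features.values())
    # Every edge leaving the community adds to degree but not to internal degree.
    cut = total_degree - total_internal
    return cut / total_degree if total_degree else 0.0


# Two triangles {0, 1, 2} and {3, 4, 5} joined by the single edge 2-3.
EXAMPLE_ADJ: Graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}

if __name__ == "__main__":
    community = {0, 1, 2}
    # Both values equal 1/7: one cut edge over a volume of 2 + 2 + 3.
    print(conductance_direct(EXAMPLE_ADJ, community))
    print(conductance_from_node_features(EXAMPLE_ADJ, community))
```

The second function never inspects edges across the community boundary explicitly; it only pools node-level quantities, which is the kind of computation a learned scoring function over node features could in principle approximate.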


Published In

Proceedings of the VLDB Endowment, Volume 16, Issue 10
June 2023
295 pages
ISSN: 2150-8097

Publisher

VLDB Endowment

Publication History

Published: 01 June 2023
Published in PVLDB Volume 16, Issue 10

Qualifiers

  • Research-article

Article Metrics

  • Downloads (Last 12 months): 62
  • Downloads (Last 6 weeks): 2
Reflects downloads up to 08 Feb 2025

Cited By

  • (2024) FCS-HGNN: Flexible Multi-type Community Search in Heterogeneous Information Networks. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 207-217. DOI: 10.1145/3627673.3679696. Online publication date: 21-Oct-2024
  • (2024) Scalable Community Search over Large-scale Graphs based on Graph Transformer. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1680-1690. DOI: 10.1145/3626772.3657771. Online publication date: 10-Jul-2024
  • (2024) Incorporating Dynamic Temperature Estimation into Contrastive Learning on Graphs. 2024 IEEE 40th International Conference on Data Engineering (ICDE), 2889-2903. DOI: 10.1109/ICDE60146.2024.00224. Online publication date: 13-May-2024
  • (2024) Self-Training GNN-based Community Search in Large Attributed Heterogeneous Information Networks. 2024 IEEE 40th International Conference on Data Engineering (ICDE), 2765-2778. DOI: 10.1109/ICDE60146.2024.00216. Online publication date: 13-May-2024
  • (2024) Diffusion pattern mining. Knowledge and Information Systems 67:2, 1101-1129. DOI: 10.1007/s10115-024-02254-9. Online publication date: 12-Oct-2024
