research-article

Truss-Based Community Search over Streaming Directed Graphs

Published: 31 May 2024

Abstract

Community search aims to retrieve dense subgraphs that contain the query vertices. While many effective community models and algorithms have been proposed in the literature, none of them address the unique challenges posed by streaming graphs, where edges are continuously generated over time. In this paper, we investigate the problem of truss-based community search over streaming directed graphs. To address this problem, we first present a peeling-based algorithm that iteratively removes edges that do not meet the support constraints. To improve the efficiency of the peeling-based algorithm, we propose three optimizations that leverage the time information of the streaming graph and the structural information of trusses. As the peeling-based algorithm may suffer from inefficiency when the input peeling graph is large, we further propose a novel order-based algorithm that preserves the community by maintaining the deletion order of edges in the peeling algorithm. Extensive experimental results on real-world datasets show that our proposed algorithms outperform the baseline by up to two orders of magnitude in terms of throughput.
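
The peeling idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a static, undirected graph and the classic k-truss support constraint (every edge lies in at least k - 2 triangles), and it ignores edge direction, timestamps, and query vertices. The function name `k_truss_peel` and the edge-list input format are hypothetical choices made for this illustration.

```python
from collections import deque


def k_truss_peel(edges, k):
    """Return the edges of the k-truss of an undirected simple graph.

    Simplified stand-in for truss peeling: undirected, static, no timestamps.
    An edge survives only if it lies in at least k - 2 triangles.
    """
    # Build adjacency sets, skipping self-loops.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Support of an edge = number of triangles that contain it.
    support = {}
    for u in adj:
        for v in adj[u]:
            e = frozenset((u, v))
            if e not in support:
                support[e] = len(adj[u] & adj[v])

    # Queue every edge that violates the support constraint.
    queue = deque(e for e, s in support.items() if s < k - 2)
    removed = set(queue)

    while queue:
        e = queue.popleft()
        u, v = tuple(e)
        # Each common neighbour w closes a triangle that disappears with e.
        for w in adj[u] & adj[v]:
            for f in (frozenset((u, w)), frozenset((v, w))):
                if f in removed:
                    continue
                support[f] -= 1
                if support[f] < k - 2:
                    removed.add(f)
                    queue.append(f)
        adj[u].discard(v)
        adj[v].discard(u)

    return {e for e in support if e not in removed}


if __name__ == "__main__":
    # A 4-clique on {1, 2, 3, 4} plus a pendant edge (4, 5).
    demo = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5)]
    print(sorted(tuple(sorted(e)) for e in k_truss_peel(demo, 4)))
```

Removing one edge can lower the support of its neighbouring edges and trigger further removals, which is why the loop runs until the queue is empty. A streaming, directed variant would additionally have to handle edge arrivals and expirations and direction-aware support definitions, none of which this sketch attempts.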


Published In

Proceedings of the VLDB Endowment, Volume 17, Issue 8
April 2024
335 pages

Publisher

VLDB Endowment
