Efficient Maximal Frequent Group Enumeration in Temporal Bipartite Graphs

Published: 30 August 2024

Abstract

Cohesive subgraph mining is a fundamental problem in bipartite graph analysis. In practice, relationships between two types of entities often occur at specific timestamps, which can be modeled as a temporal bipartite graph. However, previous studies have largely neglected this temporal information. Moreover, directly extending the existing models may fail to find some critical groups in temporal bipartite graphs, which appear in a unilateral (i.e., one-layer) form. To fill this gap, we propose a novel model called the maximal λ-frequency group (MFG). Given a temporal bipartite graph 𝒢 = (U, V, ℰ), a vertex set V_S ⊆ V is an MFG if i) there are at least λ timestamps at each of which V_S forms a (τ_U, τ_V)-biclique with some vertices in U in the corresponding snapshot, and ii) it is maximal. To solve the problem, we propose a filter-and-verification (FilterV) method based on the Bron-Kerbosch framework, which incorporates novel filtering techniques to reduce the search space and an array-based strategy to accelerate frequency and maximality verification. Nevertheless, the cost of frequency verification in each valid candidate set computation and maximality check could limit the scalability of FilterV on larger graphs. Therefore, we further develop a verification-free (VFree) approach that leverages a proposed dynamic counting structure. Theoretically, we prove that VFree reduces the cost of each valid candidate set computation in FilterV by a factor of O(|V|). Furthermore, VFree avoids explicit maximality verification because of the search paradigm it adopts. Finally, comprehensive experiments on 15 real-world graphs demonstrate the efficiency and effectiveness of the proposed techniques and model.
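
To make the MFG definition concrete, the following is a minimal brute-force sketch in Python. It is not the FilterV or VFree algorithm from the paper; it only restates conditions i) and ii) directly. It assumes that a (τ_U, τ_V)-biclique means a biclique with at least τ_U vertices on the U side and at least τ_V vertices on the V side, and that each snapshot is given as a dictionary from a V-side vertex to its set of U-side neighbours. All function names and the toy data are hypothetical.

```python
from itertools import combinations

def common_neighbors(snapshot, vs):
    """U-side vertices adjacent to every vertex of vs in one snapshot.
    snapshot maps each V-side vertex to the set of its U-side neighbours."""
    sets = [snapshot.get(v, set()) for v in vs]
    return set.intersection(*sets) if sets else set()

def frequency(snapshots, vs, tau_u):
    """Number of timestamps at which vs has at least tau_u common neighbours."""
    return sum(1 for snap in snapshots.values()
               if len(common_neighbors(snap, vs)) >= tau_u)

def is_frequent_group(snapshots, vs, tau_u, tau_v, lam):
    """Condition i): vs is large enough and frequent enough (assumed reading)."""
    return len(vs) >= tau_v and frequency(snapshots, vs, tau_u) >= lam

def enumerate_mfgs(snapshots, tau_u, tau_v, lam):
    """Exponential brute force: test every subset of V, then keep the
    frequent groups that have no frequent proper superset (condition ii)."""
    v_all = sorted({v for snap in snapshots.values() for v in snap})
    frequent = [frozenset(c)
                for k in range(max(tau_v, 1), len(v_all) + 1)
                for c in combinations(v_all, k)
                if is_frequent_group(snapshots, c, tau_u, tau_v, lam)]
    return [s for s in frequent if not any(s < t for t in frequent)]

# Toy temporal bipartite graph with three timestamps (made-up data).
snapshots = {
    1: {"v1": {"u1", "u2"}, "v2": {"u1", "u2"}, "v3": {"u1"}},
    2: {"v1": {"u1", "u2"}, "v2": {"u1", "u2", "u3"}, "v3": {"u3"}},
    3: {"v1": {"u1"}, "v2": {"u2"}, "v3": {"u1", "u2"}},
}

result = enumerate_mfgs(snapshots, tau_u=2, tau_v=2, lam=2)
print([sorted(g) for g in result])  # [['v1', 'v2']]
```

In the toy data, {v1, v2} shares two U-side neighbours at timestamps 1 and 2, so it meets the frequency threshold λ = 2, and no larger set does. The sketch enumerates all subsets of V and is therefore only usable on tiny inputs, which is the cost the paper's filtering and counting techniques aim to avoid.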

Published In

Proceedings of the VLDB Endowment, Volume 17, Issue 11
July 2024
1039 pages

Publisher

VLDB Endowment
