Motif Prediction with Graph Neural Networks

Authors:

Cesare Miglioli,

Nicola Bernold,

Grzegorz Kwasniewski,

Raghavendra Kanakagiri,

Saleh Ashkboos,

Lukas Gianinazzi,

Torsten Hoefler

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

Pages 35 - 45

https://doi.org/10.1145/3534678.3539343

Published: 14 August 2022

Abstract

Link prediction is one of the central problems in graph mining. However, recent studies highlight the importance of higher-order network analysis, where complex structures called motifs are first-class citizens. We first show that existing link prediction schemes fail to predict motifs effectively. To alleviate this, we establish a general motif prediction problem and propose several heuristics that assess the chances for a specified motif to appear. To make the scores realistic, our heuristics consider, among others, correlations between links, i.e., the potential impact of some arriving links on the appearance of other links in a given motif. Finally, for the highest accuracy, we develop a graph neural network (GNN) architecture for motif prediction. Our architecture offers vertex features and sampling schemes that capture the rich structural properties of motifs. While our heuristics are fast and do not need any training, GNNs ensure the highest accuracy in predicting motifs, both dense (e.g., k-cliques) and sparse (e.g., k-stars). We consistently outperform the best available competitor by more than 10% on average and by up to 32% in area under the curve. Importantly, the advantages of our approach over schemes based on uncorrelated link prediction grow with motif size and complexity. We also successfully apply our architecture to predicting more arbitrary clusters and communities, illustrating its potential for graph mining beyond motif analysis.
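
To make the contrast in the abstract concrete, the following minimal sketch shows the kind of uncorrelated link-prediction baseline that the abstract compares against. It is not the paper's heuristic or GNN. It scores a candidate k-clique by treating each missing edge independently, using the Jaccard coefficient as the per-link score. The function names (`jaccard`, `clique_motif_score`), the choice of Jaccard, the product aggregation, and the toy graph are all assumptions made for illustration.

```python
from itertools import combinations
from math import prod


def jaccard(adj, u, v):
    """Jaccard coefficient of the neighborhoods of u and v."""
    nu, nv = adj[u], adj[v]
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def clique_motif_score(adj, vertices, aggregate=prod):
    """Score how likely `vertices` are to form a clique.

    Edges that already exist contribute 1.0; missing edges contribute
    their Jaccard link score. The per-edge scores are combined
    independently, so correlations between arriving links are ignored
    (the simplifying assumption of this baseline).
    """
    scores = []
    for u, v in combinations(vertices, 2):
        scores.append(1.0 if v in adj[u] else jaccard(adj, u, v))
    return aggregate(scores)


# Toy undirected graph as an adjacency map (hypothetical data).
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c", "e"},
    "e": {"d"},
}

# Candidate 3-clique {a, b, d}: only the edge (b, d) is missing.
score_product = clique_motif_score(adj, ["a", "b", "d"])
# An alternative aggregation: the weakest link bounds the motif score.
score_min = clique_motif_score(adj, ["a", "b", "d"], aggregate=min)
print(score_product, score_min)
```

Because the per-edge scores are multiplied as if independent, the score of a motif with many missing edges shrinks quickly. This is one plausible reading of why such schemes degrade as motif size grows, and of why modeling correlations between links, as the abstract describes, can help.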

Index Terms

1. Information systems
  1. Information systems applications
    1. Data mining

Published In

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

August 2022

5033 pages

ISBN:9781450393850

DOI:10.1145/3534678

General Chairs: Aidong Zhang (University of Virginia), Huzefa Rangwala (Amazon/George Mason University)

Copyright © 2022 ACM.

Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

KDD '22

KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

August 14 - 18, 2022

Washington DC, USA

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
