ADGNN: Towards Scalable GNN Training with Aggregation-Difference Aware Sampling

Published: 12 December 2023
Abstract

    Distributed computing is promising to enable large-scale graph neural network (GNN) model training. However, care is needed to avoid excessive computational and communication overheads. Sampling is promising in terms of enabling scalability, and sampling techniques have been proposed to reduce training costs. However, online sampling introduces large overheads, while offline sampling that is performed only once eliminates such overheads but instead introduces information loss and accuracy degradation. Thus, existing sampling techniques are unable to improve efficiency and accuracy simultaneously, particularly at low sampling rates. We develop a distributed system, ADGNN, for full-batch GNN training that adopts a hybrid sampling architecture to enable a trade-off between efficiency and accuracy. Specifically, ADGNN employs sampling result reuse techniques to reduce the cost associated with sampling and thus improve training efficiency. To alleviate accuracy degradation, we introduce a new metric, Aggregation Difference (AD), that quantifies the gap between aggregation over a sampled neighbor set and aggregation over the full neighbor set. We present so-called AD-Sampling, which aims to minimize the Aggregation Difference using an adaptive sampling frequency tuner. Finally, ADGNN employs an AD-importance-based sampling technique for remote neighbors to further reduce communication costs. Experiments on five real datasets show that ADGNN outperforms the state-of-the-art by up to nearly 9 times in terms of efficiency, while achieving accuracy comparable to that of non-sampling methods.
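
    The abstract does not spell out how Aggregation Difference is computed; below is a minimal sketch of one plausible reading in Python, using mean aggregation and an L2 gap. The function name, aggregation choice, and norm are illustrative assumptions, not the paper's formal definition.

        import numpy as np

        def aggregation_difference(features, full_neighbors, sampled_neighbors):
            # Aggregate (here: mean) over the full neighbor set and over the
            # sampled subset, then measure the gap between the two aggregates.
            # The paper's formal AD definition may differ from this sketch.
            full_agg = features[full_neighbors].mean(axis=0)
            sampled_agg = features[sampled_neighbors].mean(axis=0)
            return np.linalg.norm(full_agg - sampled_agg)

        # Toy usage: 5 neighbors with 3-dimensional features, sample 2 of them.
        rng = np.random.default_rng(0)
        feats = rng.normal(size=(5, 3))
        full = np.arange(5)
        sampled = rng.choice(full, size=2, replace=False)
        print(aggregation_difference(feats, full, sampled))

    Under this reading, an AD-importance-based sampler would prefer neighbor subsets (including remote neighbors) whose omission changes the aggregate the least, which matches the intuition the abstract gives for reducing communication without sacrificing accuracy.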



        Published In

        Proceedings of the ACM on Management of Data  Volume 1, Issue 4
        PACMMOD
        December 2023
        1317 pages
        EISSN: 2836-6573
        DOI: 10.1145/3637468

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 12 December 2023
        Published in PACMMOD Volume 1, Issue 4

        Author Tags

        1. aggregation difference
        2. communication reduction
        3. distributed systems
        4. graph neural networks
        5. sampling techniques

        Qualifiers

        • Research-article

        Funding Sources

        • the National Natural Science Foundation of China
        • the Fundamental Research Funds for the Central Universities
