OUTRE: An OUT-of-Core De-REdundancy GNN Training Framework for Massive Graphs within A Single Machine

Published: 01 July 2024

Abstract

Sampling-based Graph Neural Networks (GNNs) have become the de facto standard for graph learning tasks on large-scale graphs. As graphs grow to exceed the host memory of a single machine, out-of-core sampling-based GNN training has gained attention from the community. In this setting, the performance bottleneck is the data preparation process, which samples neighbor lists and gathers node features from external storage. Based on this observation, existing out-of-core GNN training frameworks design better in-memory caches so that a larger fraction of data requests can be served without querying external storage. However, this approach leaves the enormous overall volume of requested data unchanged. In this paper, we present a new perspective: reducing the overall requested data volume itself. Through a quantitative analysis, we find that Neighborhood Redundancy and Temporal Redundancy exist in out-of-core sampling-based GNN training. To reduce these two kinds of data redundancy, we propose OUTRE, an OUT-of-core de-REdundancy GNN training framework. OUTRE incorporates two new designs, partition-based batch construction and historical embedding cache, to reduce the corresponding redundancies. Moreover, we propose automatic cache space management to automatically organize the available memory across the different caches. Evaluation on four public large-scale graph datasets shows that OUTRE achieves 1.52× to 3.51× speedup over the state-of-the-art framework.
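The historical embedding cache named in the abstract can be illustrated with a minimal sketch: embeddings computed for nodes in earlier iterations are kept in memory, so later batches reuse them instead of re-fetching raw features from storage and recomputing (the temporal redundancy described above). This is a hedged illustration of the general idea, not OUTRE's actual implementation; the class and method names are hypothetical.

```python
import numpy as np

class HistoricalEmbeddingCache:
    """Sketch of a historical embedding cache: nodes whose embeddings
    were produced in earlier training iterations are served from memory,
    avoiding redundant feature fetches from external storage."""

    def __init__(self, num_nodes: int, dim: int):
        self.store = np.zeros((num_nodes, dim), dtype=np.float32)
        self.valid = np.zeros(num_nodes, dtype=bool)  # which rows are populated

    def put(self, node_ids, embeddings):
        # Record embeddings computed in the current iteration.
        self.store[node_ids] = embeddings
        self.valid[node_ids] = True

    def get(self, node_ids):
        # Split a batch's node set into cache hits (reused embeddings)
        # and misses (nodes that still require storage access).
        hits = [n for n in node_ids if self.valid[n]]
        misses = [n for n in node_ids if not self.valid[n]]
        return self.store[hits], hits, misses

# Example: after caching embeddings for nodes 0 and 2, a batch touching
# nodes {0, 1, 2} only needs external storage for node 1.
cache = HistoricalEmbeddingCache(num_nodes=5, dim=4)
cache.put([0, 2], np.ones((2, 4), dtype=np.float32))
emb, hits, misses = cache.get([0, 1, 2])
```

In a real system the cache would also bound staleness (evicting or refreshing embeddings after enough model updates) and compete for memory with the feature cache, which is what the automatic cache space management design arbitrates.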



Published In

Proceedings of the VLDB Endowment, Volume 17, Issue 11
July 2024, 1039 pages

Publisher

VLDB Endowment

Publication History

Published in PVLDB Volume 17, Issue 11


Qualifiers

  • Research-article


Article Metrics

  • Total Citations: 0
  • Total Downloads: 115 (last 12 months: 115; last 6 weeks: 15)

Reflects downloads up to 08 Feb 2025
