
Eliminating Data Processing Bottlenecks in GNN Training over Large Graphs via Two-level Feature Compression

Published: 30 August 2024, in Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 11. Publisher: VLDB Endowment.

Abstract

Training GNNs over large graphs faces a severe data processing bottleneck that spans both sampling and feature loading. To tackle this issue, we introduce F2CGT, a fast GNN training system that incorporates feature compression. To avoid potential accuracy degradation, we propose a two-level, hybrid feature compression approach that applies different compression methods to different graph nodes. This differentiated choice strikes a balance among rounding error, compression ratio, model accuracy loss, and preprocessing cost. Our theoretical analysis proves that this approach converges and delivers model accuracy comparable to conventional training without feature compression. Additionally, we co-design the on-GPU cache sub-system with compression-enabled training in F2CGT. Driven by a cost model, the new cache sub-system runs cache policies that select graph nodes with high access frequencies and partitions the spare GPU memory across the various types of graph data, improving cache hit rates. Finally, an extensive evaluation of F2CGT on two popular GNN models and four datasets, including three large public datasets, demonstrates that F2CGT achieves compression ratios of up to 128× and delivers GNN training speedups of 1.23-2.56× for single-machine training and 3.58-71.46× for distributed training with up to 32 GPUs, with marginal accuracy loss.
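
The abstract does not spell out the two compression levels, so the following is only a minimal illustrative sketch of the general idea behind two-level hybrid compression: scalar-quantize the features of a small set of "important" nodes (small rounding error, modest compression ratio) and vector-quantize the long tail (high ratio, coarser reconstruction). The degree-based split, the 8-bit settings, and all names here (compress_features, degree_threshold, etc.) are assumptions for illustration, not F2CGT's actual design.

```python
# Hypothetical two-level hybrid feature compression (NOT F2CGT's real code):
# low-error scalar codec for hub nodes, high-ratio vector codec for the rest.
import numpy as np

def scalar_quantize(x, bits=8):
    """Per-column min-max scalar quantization to unsigned 8-bit codes."""
    lo, hi = x.min(axis=0), x.max(axis=0)
    scale = (hi - lo) / (2 ** bits - 1)
    scale = np.where(scale == 0, 1.0, scale)  # guard constant columns
    codes = np.clip(np.rint((x - lo) / scale), 0, 2 ** bits - 1)
    return codes.astype(np.uint8), lo, scale

def scalar_dequantize(codes, lo, scale):
    return codes.astype(np.float32) * scale + lo

def vector_quantize(x, n_centroids=256, iters=10, seed=0):
    """Plain Lloyd's k-means: each feature row becomes a one-byte codeword
    into a small shared codebook, giving a much higher compression ratio."""
    rng = np.random.default_rng(seed)
    k = min(n_centroids, len(x))
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # assign every row to its nearest centroid (squared L2 distance)
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        assign = dists.argmin(axis=1)
        for c in range(k):
            members = assign == c
            if members.any():
                centroids[c] = x[members].mean(axis=0)
    return assign.astype(np.uint8), centroids

def compress_features(feats, degrees, degree_threshold=50):
    """Illustrative split policy: hub nodes get the lower-error scalar codec,
    the long tail gets the higher-ratio vector codec."""
    hot = degrees >= degree_threshold
    return {"hot_mask": hot,
            "scalar": scalar_quantize(feats[hot]),
            "vector": vector_quantize(feats[~hot])}
```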
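Likewise, here is a hedged sketch of what a cost-model-driven cache plan could look like: cache entries of different kinds (graph topology, compressed features) compete for one spare-memory budget, ranked by expected transfer cost saved per cached byte, so the partition between data types falls out of the cost model rather than being fixed a priori. The plan_cache function, its tuple layout, and the numbers in the demo are illustrative assumptions, not F2CGT's actual policy.

```python
# Hypothetical cost-model-driven cache planning (illustrative only). Each
# candidate cache entry carries its access frequency, its size in bytes, and
# the cost of one cache miss; we greedily cache the entries with the highest
# expected savings per cached byte until the GPU-memory budget runs out.
def plan_cache(budget_bytes, candidates):
    """candidates: list of (access_freq, entry_bytes, miss_cost) tuples.
    Returns indices of the entries to cache within the byte budget."""
    ranked = sorted(
        range(len(candidates)),
        key=lambda i: candidates[i][0] * candidates[i][2] / candidates[i][1],
        reverse=True,  # highest benefit density first
    )
    chosen, used = [], 0
    for i in ranked:
        size = candidates[i][1]
        if used + size <= budget_bytes:
            chosen.append(i)
            used += size
    return chosen

# Tiny demo: two topology entries (small, hot) and two compressed-feature
# entries (larger, each miss costs a bigger transfer); numbers are made up.
entries = [(900, 64, 1.0), (700, 64, 1.0), (300, 512, 4.0), (50, 512, 4.0)]
print(plan_cache(700, entries))  # [0, 1, 2]: both topology entries + the hot feature entry
```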

