research-article

SIMPLE: Efficient Temporal Graph Neural Network Training at Scale with Dynamic Data Placement

Authors:

Lei Chen

Proceedings of the ACM on Management of Data, Volume 2, Issue 3

Article No.: 174, Pages 1 - 25

https://doi.org/10.1145/3654977

Published: 30 May 2024

Abstract

Dynamic graphs are essential in real-world scenarios such as social media and e-commerce, where they support tasks like link prediction and node classification. Temporal Graph Neural Networks (T-GNNs) are a leading approach to handling dynamic graphs, using temporal message passing to compute node embeddings at specific timestamps. However, high CPU-GPU data loading overhead has become the bottleneck for efficient T-GNN training over large-scale dynamic graphs. In this work, we present SIMPLE, a versatile system designed to address this major efficiency bottleneck in training existing T-GNNs at scale. It incorporates a dynamic data placement mechanism, which maintains a small buffer space in available GPU memory and dynamically manages its content during T-GNN training. SIMPLE is further strengthened by systematic optimizations of the data processing flow. We compare SIMPLE to the state-of-the-art generic T-GNN training system TGL on four large-scale dynamic graphs with different underlying T-GNN models. Extensive experimental results show that SIMPLE cuts data loading cost by 80.5% to 96.8% and accelerates T-GNN training by 1.8× to 3.8× (2.6× on average) compared to TGL.
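To make the buffer idea in the abstract concrete, the following is a minimal sketch of a small, fixed-capacity feature buffer whose contents are refreshed between mini-batches. It is only an illustration under stated assumptions: the class name `FeatureBuffer`, the plain dictionaries standing in for CPU and GPU memory, and the access-frequency refresh policy are all hypothetical and are not taken from the paper, whose actual placement algorithm is not described in the abstract.

```python
from collections import Counter


class FeatureBuffer:
    """Toy fixed-capacity buffer standing in for a small GPU-resident cache.

    Hypothetical illustration: the frequency-based refresh policy is an
    assumption, not the placement algorithm used by SIMPLE.
    """

    def __init__(self, capacity, host_features):
        self.capacity = capacity
        self.host = host_features   # node_id -> feature (stands in for CPU memory)
        self.buffer = {}            # node_id -> feature (stands in for GPU memory)
        self.freq = Counter()       # running access count per node
        self.hits = 0
        self.misses = 0

    def load_batch(self, node_ids):
        """Fetch features for one mini-batch, counting buffer hits and misses."""
        out = []
        for nid in node_ids:
            self.freq[nid] += 1
            if nid in self.buffer:
                self.hits += 1
                out.append(self.buffer[nid])
            else:
                # In a real system this would be a CPU-to-GPU transfer.
                self.misses += 1
                out.append(self.host[nid])
        return out

    def refresh(self):
        """Re-populate the buffer with the most frequently accessed nodes so far."""
        top = [nid for nid, _ in self.freq.most_common(self.capacity)]
        self.buffer = {nid: self.host[nid] for nid in top}


# Ten nodes with 4-dimensional features; the buffer holds only two of them.
host = {i: [float(i)] * 4 for i in range(10)}
buf = FeatureBuffer(capacity=2, host_features=host)

batches = [[0, 1, 0, 1, 2], [0, 1, 3, 0], [0, 1, 1, 4]]
for batch in batches:
    buf.load_batch(batch)
    buf.refresh()   # manage buffer content between batches

print(f"hits={buf.hits} misses={buf.misses}")
```

In this toy run the first batch misses on every access because the buffer starts empty, while later batches are served mostly from the buffer once the two hottest nodes have been placed there. Every miss corresponds to the kind of CPU-GPU data loading that the abstract identifies as the bottleneck.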

Index Terms

1. Information systems
  1. Data management systems

Published In

Proceedings of the ACM on Management of Data, Volume 2, Issue 3

SIGMOD

June 2024

1953 pages

EISSN: 2836-6573

DOI: 10.1145/3670010

Editor: Divyakant Agrawal, UC Santa Barbara, United States

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States
