
ETC: Efficient Training of Temporal Graph Neural Networks over Large-Scale Dynamic Graphs

Published: 02 May 2024

Abstract

Dynamic graphs play a crucial role in real-world applications such as link prediction and node classification on social media and e-commerce platforms. Temporal Graph Neural Networks (T-GNNs) have emerged as a leading approach for handling dynamic graphs, using temporal message passing to compute temporal node embeddings. However, training existing T-GNNs on large-scale dynamic graphs is prohibitively expensive due to an ill-suited batching scheme and significant data access overhead. In this paper, we introduce ETC, a generic framework designed for efficient T-GNN training at scale. ETC incorporates a novel data batching scheme that enables large training batches, improving model computation efficiency, while preserving model effectiveness by restricting the information loss in each training batch. To reduce data access overhead, ETC employs a three-step data access policy that exploits the data access pattern of T-GNN training, significantly reducing redundant data access volume. Additionally, ETC utilizes an inter-batch pipeline mechanism that decouples data access from model computation, further reducing data access costs. Extensive experiments on real-world dynamic graphs with millions of interactions demonstrate the effectiveness of ETC: it achieves training speedups of 1.6× to 62.4× over state-of-the-art T-GNN training frameworks.
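The inter-batch pipeline mentioned above can be illustrated with a minimal producer/consumer sketch: one thread performs data access (prefetching the next batches) while the main thread runs model computation on the current batch. This is only a schematic of the general pipelining idea, not ETC's implementation; the names `load_batch`, `train_step`, and the pipeline `depth` parameter are illustrative placeholders.

```python
# Minimal sketch of inter-batch pipelining: overlap data access with
# model computation via a bounded queue and a background prefetch thread.
import queue
import threading

def load_batch(i):
    # Stand-in for T-GNN data access (e.g., fetching temporal neighbors
    # and features for batch i).
    return list(range(i, i + 4))

def train_step(batch):
    # Stand-in for model computation on one batch.
    return sum(batch)

def pipelined_training(num_batches, depth=2):
    q = queue.Queue(maxsize=depth)  # bounded: prefetch at most `depth` ahead

    def producer():
        for i in range(num_batches):
            q.put(load_batch(i))  # data access runs ahead of compute
        q.put(None)               # sentinel: no more batches

    threading.Thread(target=producer, daemon=True).start()
    results = []
    while (batch := q.get()) is not None:
        results.append(train_step(batch))  # compute overlaps next prefetch
    return results

print(pipelined_training(3))  # → [6, 10, 14]
```

With a bounded queue, data access for batch i+1 proceeds while the model computes on batch i, so the per-batch cost approaches max(access, compute) rather than their sum.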



Published In

Proceedings of the VLDB Endowment, Volume 17, Issue 5 (January 2024), 233 pages
Publisher: VLDB Endowment
Published: 02 May 2024, in PVLDB Volume 17, Issue 5


Cited By

  • (2024) Efficient Training of Graph Neural Networks on Large Graphs. Proceedings of the VLDB Endowment 17, 12, 4237--4240. DOI: 10.14778/3685800.3685844. Online publication date: 8 Nov 2024.
  • (2024) Fight Fire with Fire: Towards Robust Graph Neural Networks on Dynamic Graphs via Actively Defense. Proceedings of the VLDB Endowment 17, 8, 2050--2063. DOI: 10.14778/3659437.3659457. Online publication date: 31 May 2024.
  • (2024) SIMPLE: Efficient Temporal Graph Neural Network Training at Scale with Dynamic Data Placement. Proceedings of the ACM on Management of Data 2, 3, 1--25. DOI: 10.1145/3654977. Online publication date: 30 May 2024.
  • (2024) Towards Efficient Temporal Graph Learning: Algorithms, Frameworks, and Tools. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 5530--5533. DOI: 10.1145/3627673.3679104. Online publication date: 21 Oct 2024.
  • (2024) Dynamic Neighborhood Selection for Context Aware Temporal Evolution Using Graph Neural Networks. Cognitive Computation 17, 1. DOI: 10.1007/s12559-024-10359-0. Online publication date: 5 Dec 2024.
