Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
research-article

SIMPLE: Efficient Temporal Graph Neural Network Training at Scale with Dynamic Data Placement

Published: 30 May 2024 Publication History

Abstract

Dynamic graphs are essential in real-world scenarios like social media and e-commerce for tasks such as predicting links and classifying nodes. Temporal Graph Neural Networks (T-GNNs) stand out as a prime solution for managing dynamic graphs, employing temporal message passing to compute node embeddings at specific timestamps. Nonetheless, the high CPU-GPU data loading overhead has become the bottleneck for efficient training of T-GNNs over large-scale dynamic graphs. In this work, we present SIMPLE, a versatile system designed to address the major efficiency bottleneck in training existing T-GNNs on a large scale. It incorporates a dynamic data placement mechanism, which maintains a small buffer space in available GPU memory and dynamically manages its content during T-GNN training. SIMPLE is also empowered by systematic optimizations towards data processing flow. We compare SIMPLE to the state-of-the-art generic T-GNN training system TGL on four large-scale dynamic graphs with different underlying T-GNN models. Extensive experimental results show that SIMPLE effectively cuts down 80.5% ~ 96.8% data loading cost, and accelerates T-GNN training by 1.8× ~ 3.8× (2.6× on average) compared to TGL.

References

[1]
[2023]. Stack-Overflow. https://snap.stanford.edu/data/sx-stackoverflow.html
[2]
[2023]. Wiki-Talk. http://snap.stanford.edu/data/wiki-talk-temporal.html
[3]
Valentina Cacchiani, Manuel Iori, Alberto Locatelli, and Silvano Martello. 2022. Knapsack problems-An overview of recent advances. Part II: Multiple, multidimensional, and quadratic knapsack problems. Computers & Operations Research, Vol. 143 (2022), 105693.
[4]
Venkatesan T Chakaravarthy, Shivmaran S Pandian, Saurabh Raje, Yogish Sabharwal, Toyotaro Suzumura, and Shashanka Ubaru. 2021. Efficient scaling of dynamic graph neural networks. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. 1--15.
[5]
Weilin Cong, Si Zhang, Jian Kang, Baichuan Yuan, Hao Wu, Xin Zhou, Hanghang Tong, and Mehrdad Mahdavi. 2023. Do We Really Need Complicated Model Architectures For Temporal Networks?. In International Conference on Learning Representations.
[6]
Arnaud Fréville. 2004. The multidimensional 0--1 knapsack problem: An overview. European Journal of Operational Research, Vol. 155, 1 (2004), 1--21.
[7]
Shihong Gao, Yiming Li, Yanyan Shen, Yingxia Shao, and Lei Chen. 2024. ETC: Efficient Training of Temporal Graph Neural Networks over Large-scale Dynamic Graphs. Proceedings of the VLDB Endowment, Vol. 17, 5 (2024), 1060--1072.
[8]
Palash Goyal, Sujit Rokka Chhetri, and Arquimedes Canedo. 2020. dyngraph2vec: Capturing network dynamics using dynamic graph representation learning. Knowledge-Based Systems, Vol. 187 (2020), 104816.
[9]
Mingyu Guan, Anand Padmanabha Iyer, and Taesoo Kim. 2022. DynaGraph: dynamic graph neural networks at scale. In Proceedings of the 5th ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA). 1--10.
[10]
Ehsan Hajiramezanali, Arman Hasanzadeh, Krishna Narayanan, Nick Duffield, Mingyuan Zhou, and Xiaoning Qian. 2019. Variational graph recurrent neural networks. Advances in neural information processing systems, Vol. 32 (2019).
[11]
Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. Advances in neural information processing systems, Vol. 30 (2017).
[12]
Ming Jin, Yuan-Fang Li, and Shirui Pan. 2022. Neural Temporal Walks: Motif-Aware Representation Learning on Continuous-Time Dynamic Graphs. In Advances in Neural Information Processing Systems.
[13]
Thomas N Kipf and Max Welling. 2017. Semi-Supervised Classification with Graph Convolutional Networks. In International Conference on Learning Representations.
[14]
Srijan Kumar, Xikun Zhang, and Jure Leskovec. 2019. Predicting dynamic embedding trajectory in temporal interaction networks. In Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining. 1269--1278.
[15]
Siu Kwan Lam, Antoine Pitrou, and Stanley Seibert. 2015. Numba: A llvm-based python jit compiler. In Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC. 1--6.
[16]
Donghee Lee, Jongmoo Choi, Jong-Hun Kim, Sam H Noh, Sang Lyul Min, Yookun Cho, and Chong Sang Kim. 2001. LRFU: A spectrum of policies that subsumes the least recently used and least frequently used policies. IEEE transactions on Computers, Vol. 50, 12 (2001), 1352--1361.
[17]
Kalev Leetaru and Philip A Schrodt. 2013. Gdelt: Global data on events, location, and tone, 1979--2012. In ISA annual convention, Vol. 2. Citeseer, 1--49.
[18]
Guoliang Li, Xuanhe Zhou, and Lei Cao. 2021. AI meets database: AI4DB and DB4AI. In Proceedings of the 2021 International Conference on Management of Data. 2859--2866.
[19]
Haoyang Li and Lei Chen. 2021. Cache-based gnn system for dynamic graphs. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management. 937--946.
[20]
Yiming Li, Yanyan Shen, Lei Chen, and Mingxuan Yuan. 2023 a. Orca: Scalable Temporal Graph Neural Network Training with Theoretical Guarantees. Proceedings of the ACM on Management of Data, Vol. 1, 1 (2023), 1--27.
[21]
Yiming Li, Yanyan Shen, Lei Chen, and Mingxuan Yuan. 2023 b. Zebra: When Temporal Graph Neural Networks Meet Temporal Personalized PageRank. Proceedings of the VLDB Endowment, Vol. 16, 6 (2023), 1332--1345.
[22]
Zhiyuan Li, Xun Jian, Yue Wang, Yingxia Shao, and Lei Chen. 2024. DAHA: Accelerating GNN Training with Data and Hardware Aware Execution Planning., Vol. 17, 6 (2024), 1364--1376.
[23]
Zhiqi Lin, Cheng Li, Youshan Miao, Yunxin Liu, and Yinlong Xu. 2020. Pagraph: Scaling gnn training on large graphs via computation-aware caching. In Proceedings of the 11th ACM Symposium on Cloud Computing. 401--415.
[24]
Silvano Martello and Paolo Toth. 1990. Knapsack problems: algorithms and computer implementations. John Wiley & Sons, Inc.
[25]
Seung Won Min, Kun Wu, Mert Hidayetoglu, Jinjun Xiong, Xiang Song, and Wen-mei Hwu. 2022. Graph neural network training and data tiering. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 3555--3565.
[26]
Seung Won Min, Kun Wu, Sitao Huang, Mert Hidayetoug lu, Jinjun Xiong, Eiman Ebrahimi, Deming Chen, and Wen-mei Hwu. 2021. Large graph convolutional network training with GPU-oriented data communication architecture. Proceedings of the VLDB Endowment, Vol. 14, 11 (2021), 2087--2100.
[27]
Elizabeth J O'neil, Patrick E O'neil, and Gerhard Weikum. 1993. The LRU-K page replacement algorithm for database disk buffering. Acm Sigmod Record, Vol. 22, 2 (1993), 297--306.
[28]
Aldo Pareja, Giacomo Domeniconi, Jie Chen, Tengfei Ma, Toyotaro Suzumura, Hiroki Kanezashi, Tim Kaler, Tao Schardl, and Charles Leiserson. 2020. Evolvegcn: Evolving graph convolutional networks for dynamic graphs. In Proceedings of the AAAI conference on artificial intelligence, Vol. 34. 5363--5370.
[29]
Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems, Vol. 32 (2019).
[30]
Jingshu Peng, Zhao Chen, Yingxia Shao, Yanyan Shen, Lei Chen, and Jiannong Cao. 2022. Sancus: Staleness-aware communication-avoiding full-graph decentralized training in large-scale graph neural networks. Proceedings of the VLDB Endowment, Vol. 15, 9 (2022), 1937--1950.
[31]
Emanuele Rossi, Ben Chamberlain, Fabrizio Frasca, Davide Eynard, Federico Monti, and Michael Bronstein. 2020. Temporal Graph Networks for Deep Learning on Dynamic Graphs. In ICML 2020 Workshop on Graph Representation Learning.
[32]
Rakshit Trivedi, Mehrdad Farajtabar, Prasenjeet Biswal, and Hongyuan Zha. 2019. Dyrep: Learning representations over dynamic graphs. In International conference on learning representations.
[33]
Petar Velivc ković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. 2018. Graph Attention Networks. In International Conference on Learning Representations.
[34]
Minjie Yu Wang. 2019. Deep graph library: Towards efficient and scalable deep learning on graphs. In ICLR workshop on representation learning on graphs and manifolds.
[35]
Xuhong Wang, Ding Lyu, Mengjian Li, Yang Xia, Qi Yang, Xinwen Wang, Xinguang Wang, Ping Cui, Yupu Yang, Bowen Sun, et al. 2021b. Apan: Asynchronous propagation attention network for real-time temporal graph embedding. In Proceedings of the 2021 international conference on management of data. 2628--2638.
[36]
Yanbang Wang, Yen-Yu Chang, Yunyu Liu, Jure Leskovec, and Pan Li. 2021a. Inductive Representation Learning in Temporal Networks via Causal Anonymous Walks. In International Conference on Learning Representations.
[37]
Da Xu, Chuanwei Ruan, Evren Korpeoglu, Sushant Kumar, and Kannan Achan. 2020. Inductive representation learning on temporal graphs. In International Conference on Learning Representations.
[38]
Jianbang Yang, Dahai Tang, Xiaoniu Song, Lei Wang, Qiang Yin, Rong Chen, Wenyuan Yu, and Jingren Zhou. 2022. GNNLab: a factored system for sample-based GNN training over GPUs. In Proceedings of the Seventeenth European Conference on Computer Systems. 417--434.
[39]
Xin Zhang, Yanyan Shen, and Lei Chen. 2022. Feature-Oriented Sampling for Fast and Scalable GNN Training. In 2022 IEEE International Conference on Data Mining (ICDM). IEEE, 723--732.
[40]
Xin Zhang, Yanyan Shen, Yingxia Shao, and Lei Chen. 2023. DUCATI: A Dual-Cache Training System for Graph Neural Networks on Giant Graphs with the GPU. Proceedings of the ACM on Management of Data, Vol. 1, 2 (2023), 1--24.
[41]
Chenguang Zheng, Hongzhi Chen, Yuxuan Cheng, Zhezheng Song, Yifan Wu, Changji Li, James Cheng, Hao Yang, and Shuai Zhang. 2022. ByteGNN: efficient graph neural network training at large scale. Proceedings of the VLDB Endowment, Vol. 15, 6 (2022), 1228--1242.
[42]
Hongkuan Zhou, Da Zheng, Israt Nisa, Vasileios Ioannidis, Xiang Song, and George Karypis. 2022. TGL: A General Framework for Temporal GNN Training on Billion-Scale Graphs. Proceedings of the VLDB Endowment, Vol. 15, 8 (2022), 1937--1950.
[43]
Rong Zhu, Kun Zhao, Hongxia Yang, Wei Lin, Chang Zhou, Baole Ai, Yong Li, and Jingren Zhou. 2019. AliGraph: A Comprehensive Graph Neural Network Platform. Proceedings of the VLDB Endowment, Vol. 12, 12 (2019), 2094--2105.

Index Terms

  1. SIMPLE: Efficient Temporal Graph Neural Network Training at Scale with Dynamic Data Placement

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image Proceedings of the ACM on Management of Data
    Proceedings of the ACM on Management of Data  Volume 2, Issue 3
    SIGMOD
    June 2024
    1953 pages
    EISSN:2836-6573
    DOI:10.1145/3670010
    Issue’s Table of Contents
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 30 May 2024
    Published in PACMMOD Volume 2, Issue 3

    Permissions

    Request permissions for this article.

    Author Tags

    1. dynamic data placement
    2. large-scale dynamic graphs
    3. temporal graph neural networks

    Qualifiers

    • Research-article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • 0
      Total Citations
    • 159
      Total Downloads
    • Downloads (Last 12 months)159
    • Downloads (Last 6 weeks)44
    Reflects downloads up to 30 Aug 2024

    Other Metrics

    Citations

    View Options

    Get Access

    Login options

    Full Access

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media