Global Neighbor Sampling for Mixed CPU-GPU Training on Giant Graphs

Published: 14 August 2021

Abstract

Graph neural networks (GNNs) are powerful tools for learning from graph data and are widely used in applications such as social network recommendation, fraud detection, and graph search. The graphs in these applications are typically large, often containing hundreds of millions of nodes, and training GNN models efficiently on such graphs remains a major challenge. Although a number of sampling-based methods have been proposed to enable mini-batch training on large graphs, these methods have not been shown to work on truly industry-scale graphs, which require GPUs or mixed CPU-GPU training. State-of-the-art sampling-based methods are usually not optimized for these real-world hardware setups, in which data movement between CPUs and GPUs is a bottleneck. To address this issue, we propose Global Neighborhood Sampling, which targets GNN training on giant graphs in the mixed CPU-GPU setting. The algorithm periodically samples a global cache of nodes shared by all mini-batches and stores it on the GPUs. This global cache enables in-GPU importance sampling of mini-batches, which drastically reduces the number of nodes in a mini-batch, especially in the input layer. As a result, both the data copied between CPU and GPU and the mini-batch computation shrink, without compromising the training convergence rate or model accuracy. We provide a highly efficient implementation of this method and show that it outperforms an efficient node-wise neighbor sampling baseline by a factor of 2× to 4× on giant graphs. It also outperforms an efficient implementation of LADIES with small layers by a factor of 2× to 14× while achieving much higher accuracy than LADIES. We also analyze the proposed algorithm theoretically and show that, with cached node data of a proper size, it enjoys a convergence rate comparable to that of the underlying node-wise sampling method.
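The caching idea in the abstract can be illustrated with a small, CPU-only sketch. It is not the authors' implementation: the abstract does not give the cache distribution or the importance weights, so the code below assumes a cache drawn with probability proportional to degree + 1 and substitutes a simple cache-first neighbor selection for the paper's importance sampling. It covers a single hop only, and all function names (`sample_cache`, `sample_neighbors`, `minibatch_input_nodes`, `cpu_copy_count`) are hypothetical. The sketch only shows why preferring cached neighbors reduces the number of feature rows that would have to be copied from CPU memory.

```python
import random


def sample_cache(adj, cache_size, rng):
    """Draw a global cache of nodes.

    Assumption: nodes are drawn with probability proportional to
    (degree + 1); the abstract does not specify the distribution.
    """
    nodes = list(adj)
    weights = [len(adj[v]) + 1 for v in nodes]  # +1 keeps every weight positive
    target = min(cache_size, len(nodes))
    cache = set()
    while len(cache) < target:  # repeated weighted draws, duplicates ignored
        cache.add(rng.choices(nodes, weights=weights, k=1)[0])
    return cache


def sample_neighbors(adj, node, fanout, cache, rng):
    """Sample up to `fanout` neighbors, taking cached neighbors first.

    Assumption: this cache-first rule is a stand-in for the paper's
    importance sampling, whose weights the abstract does not give.
    """
    cached = [u for u in adj[node] if u in cache]
    uncached = [u for u in adj[node] if u not in cache]
    picked = rng.sample(cached, min(fanout, len(cached)))
    remaining = fanout - len(picked)
    if remaining > 0:
        picked += rng.sample(uncached, min(remaining, len(uncached)))
    return picked


def minibatch_input_nodes(adj, seeds, fanout, cache, rng):
    """Collect the one-hop input nodes needed for a mini-batch of seeds."""
    inputs = set(seeds)
    for v in seeds:
        inputs.update(sample_neighbors(adj, v, fanout, cache, rng))
    return inputs


def cpu_copy_count(inputs, cache):
    """Number of input nodes whose features are not already in the cache."""
    return len(inputs - cache)


if __name__ == "__main__":
    rng = random.Random(0)
    n = 1000
    adj = {v: [] for v in range(n)}
    for _ in range(5000):  # small random undirected graph for illustration
        a, b = rng.randrange(n), rng.randrange(n)
        if a != b and b not in adj[a]:
            adj[a].append(b)
            adj[b].append(a)
    seeds = rng.sample(range(n), 32)
    cache = sample_cache(adj, cache_size=100, rng=rng)
    for name, c in (("no cache", set()), ("with cache", cache)):
        inputs = minibatch_input_nodes(adj, seeds, fanout=5, cache=c, rng=rng)
        print(name, "-> feature rows copied from CPU:", cpu_copy_count(inputs, c))
```

In a real mixed CPU-GPU setup the cached features would live in GPU memory and only the uncached rows would cross the CPU-GPU link; here the count returned by `cpu_copy_count` stands in for that transfer volume.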

    Published In

    KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
    August 2021
    4259 pages
    ISBN:9781450383325
    DOI:10.1145/3447548
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. graph neural networks
    2. mixed CPU-GPU training
    3. neighbor sampling

    Qualifiers

    • Research-article

    Conference

    KDD '21
    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
