Resource-Efficient Training for Large Graph Convolutional Networks with Label-Centric Cumulative Sampling

Published: 25 April 2022
DOI: 10.1145/3485447.3512165

Abstract

    Graph Convolutional Networks (GCNs) are popular for learning representations of graph data and have a wide range of applications in social networks, recommendation systems, etc. However, training GCN models on large networks is resource-intensive and time-consuming, which hinders their real-world deployment. Existing GCN training methods optimize the sampling of mini-batches for stochastic gradient descent to accelerate the training process, but they do not reduce the problem size and achieve only a limited reduction in computational complexity. In this paper, we argue that a GCN can be trained on a sampled subgraph to produce approximate node representations, which suggests a novel perspective for accelerating GCN training via network sampling. To this end, we propose a label-centric cumulative sampling (LCS) framework for training GCNs on large graphs. The proposed method constructs a subgraph cumulatively based on probabilistic sampling and trains the GCN model iteratively to generate approximate node representations. The optimality of LCS is theoretically guaranteed: it minimizes the bias of the node aggregation procedure in GCN training. Extensive experiments on four real-world network datasets show that the LCS framework accelerates the training of state-of-the-art GCN models by up to 17x without a noteworthy drop in model accuracy.
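    To make the cumulative sampling procedure concrete, here is a minimal Python sketch of how a label-centric subgraph could be grown round by round and retrained on. This is an illustration under assumptions, not the authors' implementation: the names cumulative_sample and train_gcn are hypothetical, the uniform frontier sampling merely stands in for the bias-minimizing sampling probabilities the paper derives, and the rounds/batch parameters are invented for the example.

```python
# Hypothetical sketch of label-centric cumulative sampling (LCS) as described
# in the abstract: grow a subgraph outward from the labeled nodes by
# probabilistic sampling, retraining the GCN on each enlarged subgraph.
import random
import networkx as nx

def cumulative_sample(graph: nx.Graph, labeled_nodes, rounds=5, batch=256,
                      train_gcn=None):
    sampled = set(labeled_nodes)
    for _ in range(rounds):
        # Frontier: neighbors of the current subgraph not yet sampled.
        frontier = {v for u in sampled for v in graph.neighbors(u)} - sampled
        if not frontier:
            break
        # Uniform choice here for simplicity; the paper instead derives
        # sampling probabilities that minimize bias in node aggregation.
        picks = random.sample(sorted(frontier), min(batch, len(frontier)))
        sampled.update(picks)
        if train_gcn is not None:
            train_gcn(graph.subgraph(sampled))  # iterative retraining step
    return graph.subgraph(sampled)
```

    Because each retraining pass touches only the current subgraph rather than the full network, the per-iteration cost shrinks with the sample, which is the source of the reduction in problem size claimed in the abstract.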


    Cited By

    • (2024) Coupled Attention Networks for Multivariate Time Series Anomaly Detection. IEEE Transactions on Emerging Topics in Computing 12, 1 (Jan 2024), 240–253. https://doi.org/10.1109/TETC.2023.3280577


              Published In

              WWW '22: Proceedings of the ACM Web Conference 2022
              April 2022
              3764 pages
              ISBN: 9781450390965
              DOI: 10.1145/3485447

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Author Tags

              1. Graph Convolutional Network
              2. Model Training Acceleration
              3. Network Sampling

              Qualifiers

              • Research-article
              • Research
              • Refereed limited

              Conference

              WWW '22: The ACM Web Conference 2022
              April 25–29, 2022
              Virtual Event, Lyon, France

              Acceptance Rates

              Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
