Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/3589335.3651920acmconferencesArticle/Chapter ViewAbstractPublication PagesthewebconfConference Proceedingsconference-collections
research-article
Open access

Graph Coarsening via Convolution Matching for Scalable Graph Neural Network Training

Published: 13 May 2024 Publication History

Abstract

Graph summarization as a preprocessing step is an effective and complementary technique for scalable graph neural network (GNN) training. In this work, we propose the Coarsening Via Convolution Matching (ConvMatch) algorithm and a highly scalable variant, A-ConvMatch, for creating summarized graphs that preserve the output of graph convolution. We evaluate ConvMatch on six real-world link prediction and node classification graph datasets, and show it is efficient and preserves prediction performance while significantly reducing the graph size. Notably, ConvMatch achieves up to 95% of the prediction performance of GNNs on node classification while trained on graphs summarized down to 1% the size of the original graph. Furthermore, on link prediction tasks, ConvMatch consistently outperforms all baselines, achieving up to a 2X improvement.

Supplemental Material

MOV File
Supplemental video

References

[1]
Esra Akbas and Mehmet Emin Aktas. 2019. Network Embedding: On Compression and Learning. In IEEE Transactions on Big Data.
[2]
Ondrej Bohdal, Yongxin Yang, and Timothy Hospedales. 2020. Flexible Dataset Distillation: Learn Labels Instead of Images. In Arxiv preprint.
[3]
Joan Bruna, Wojciech Zaremba, Arthur Szlam, and Yann Lecun. 2014. Spectral networks and Locally Connected Networks on Graphs. In International Conference on Learning Representations (ICLR).
[4]
Michaël Defferrard, Xavier Bresson, and Pierre Vandergheynst. 2016. Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering. In Advances in Neural Information Processing Systems (NIPS).
[5]
Chenhui Deng, Zhiqiang Zhao, Yongyu Wang, Zhiru Zhang, and Zhuo Feng. 2020. GraphZoom: A Multi-Level Spectral Approach for Accurate and Scalable Graph Embedding. In International Conference on Learning Representations (ICLR).
[6]
Cody Dunne and Ben Schneiderman. 2013. Motif Simplification: Improving Network Visualization Readability with Fan, Connector, and Clique Glyphs. In ACM SIGCHI Conference on Human Factors in Computing Systems (CHI).
[7]
Matthias Englert, Anupam Gupta, Robert Krauthgamer, Harald Racke, Inbal Talgam-Cohen, and Kunal Talwar. 2014. Vertex Sparsifiers: New Results From Old Techniques. SIAM J. Comput., Vol. 43, 4 (2014), 1239--1262.
[8]
Matthew Fahrbach, Gramoz Goranci, Richard Peng, Sushant Sachdeva, and Chi Wang. 2020. Faster Graph Embeddings Via Coarsening. In International Conference on Machine Learning (ICML).
[9]
Matthias Fey and Jan E. Lenssen. 2019. Fast Graph Representation Learning with PyTorch Geometric. In ICLR Workshop on Representation Learning on Graphs and Manifolds.
[10]
Swapnil Gandhi and Anand Padmanabha Iyer. 2021. P3: Distributed deep graph learning at scale. In 15th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI}21). 551--568.
[11]
Sariel Har-Peled and Akash Kushal. 2005. Smaller Coresets for K-Mediam and K-Means Clustering. In ACM Annual Symposium on Computational Geometry (ACG).
[12]
David Harel and Yehuda Koren. 2002. A Fast Multi-scale Method for Drawing Large Graphs. Journal of Graph Algorithms and Applications, Vol. 6, 3 (2002), 179--202.
[13]
Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, and Jure Leskovec. 2020a. Open graph benchmark: Datasets for machine learning on graphs. Advances in neural information processing systems, Vol. 33 (2020), 22118--22133.
[14]
Weihua Hu, Matthias Fey, Marinka Zitnk, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, and Jure Leskovec. 2020b. Open Graph Benchmark: Datasets for Machine Learning on Graphs. In Conference on Neural Information Processing Systems (NeurIPS).
[15]
Zengfeng Huang, Shengzhong Zhang, Chong Xi, Tang Liu, and Min Zhou. 2021. Scaling Up Graph Neural Networks Via Graph Coarsening. In ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD).
[16]
Peng Jiang and Masuma Akter Rumi. 2021. Communication-efficient sampling for distributed training of graph convolutional networks. arXiv preprint arXiv:2101.07706 (2021).
[17]
Wei Jin, Xianfeng Tang, Haoming Jiang, Zheng Li, Danqing Zhang, Jiliang Tang, and Bing Yin. 2022a. Condensing Graphs via One-Step Gradient Matching. In ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD).
[18]
Wei Jin, Lingxiao Zhao, Schichang Zhang, Yozen Liu, Jiliang Tang, and Neil Shah. 2022b. Graph Condensation for Graph Neural Networks. In International Conference on Learning Representations (ICLR).
[19]
Thomas N. Kipf and Max Welling. 2017. Semi-Supervised Classification with Graph Convolutional Networks. In International Conference on Learning Representations (ICLR).
[20]
Manoj Kumar, Anurag Sharma, Shashwat Saxena, and Sandeep Kumar. 2023. Featured Graph Coarsening with Similarity Guarantees. In International Conference on Machine Learning (ICML).
[21]
Ron Levie, Federico Monti, Xavier Bresson, and Michael M. Bronstein. 2017. CayleyNets: Graph Convolutional Neural Networks with Complex Rational Spectral Filters. IEEE Transactions on Signal Processing, Vol. 67, 1 (2017), 97--109.
[22]
Jiongqian Liang, Saket Gurukar, and Srinivasan Parthasarathy. 2021. MILE: A Multi-Level Framework for Scalable Graph Embedding. In International AAAI Conference on Web and Social Media (ICWSM).
[23]
Mengyang Liu, Shanchuan Li, Xinshi Chen, and Le Song. 2022. Graph Condensation via Receptive Field Distribution Matching. In Arxiv preprint.
[24]
Yike Liu, Tara Safavi, Abhilash Dighe, and Danai Koutra. 2019. Graph Summarization Methods and Applications: A Survey. Comput. Surveys, Vol. 51, 62 (2019).
[25]
Andreas Loukas. 2019. Graph Reduction with Spectral and Cut Guarantees. Journal of Machine Learning Research, Vol. 20, 116 (2019), 1--42.
[26]
Baharan Mirzasoleiman, Jeff Bilmes, and Jure Leskovec. 2020. Coreset for Data-efficient Training of Machine Learning Models. In International Conference on Machine Learning (ICML).
[27]
Ankur Moitra. 2009. Approximation Algorithms for Multicommodity-type Problems with Guarantees Independent of the Graph Size. In IEEE Symposium on Foundation of Computer Science.
[28]
Timothy Nguyen, Zhourong Chen, and Jaehoon Lee. 2021. Dataset Meta-Learning from Kernel Ridge-Regression. In International Conference on Learning Representations (ICLR).
[29]
Manish Purohit, B. Aditya Prakash, Chanhyun Kang, Yao Zhang, and V. S. Subrahmanian. 2014. Fast Influence-based Coarsening for Large Networks. In ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD).
[30]
Morteza Ramezani, Weilin Cong, Mehrdad Mahdavi, Mahmut Kandemir, and Anand Sivasubramaniam. 2021. Learn Locally, Correct Globally: A Distributed Algorithm for Training Graph Neural Networks. In International Conference on Learning Representations.
[31]
Prithviraj Sen, Galileo Namata, Mustafa Bilgic, Lise Getoor, Brian Galligher, and Tina Eliassi-Rad. 2008. Collective Classification in Network Data. AI Magazine, Vol. 29, 3 (2008).
[32]
Ozan Sener and Silvio Savarses. 2018. Active Learning for Convolutional Neural Networks: A Core-Set Approach. In International Conference on Learning Representations (ICLR).
[33]
Ilia Sucholutsky and Matthias Schonlau. 2020. Soft-Label Dataset Distillation and Text Dataset Distillation. In Arxiv preprint.
[34]
Ivor W. Tsang, James T. Kwok, and Pak-Min Cheung. 2005. Core Vector Machines: Fast SVM Training on Very Large Data Sets. Journal of Machine Learning Research, Vol. 6 (2005), 363--392.
[35]
Chris Walshaw. 2006. A Multilevel Algorithm for Force-Directed Graph Drawing. Journal of Graph Algorithms and Applications, Vol. 7, 3 (2006), 253--285.
[36]
Minjie Wang, Da Zheng, Zihao Ye, Quan Gan, Mufei Li, Xiang Song, Jinjing Zhou, Chao Ma, Lingfan Yu, Yu Gai, Tianjun Xiao, Tong He, George Karypis, Jinyang Li, and Zheng Zhang. 2019. Deep Graph Library: A Graph-Centric, Highly-Performant Package for Graph Neural Networks. arXiv preprint arXiv:1909.01315 (2019).
[37]
Tongzhou Wang, Jun-Yan Zhu, Antonio Torralba, and Alexei A. Efros. 2018. Dataset Distillation. In Arxiv preprint.
[38]
Max Welling. 2009. Herding Dynamical Weights to Learn. In International Conference on Machine Learning (ICML).
[39]
Felix Wu, Tianyi Zhang, Amauri Holanda de Souza Jr., Christopher Fifty, Tao Yu, and Kilian Q. Wienberger. 2019. Simplifying Graph Convolutional Networks. In International Conference on Machine Learning (ICML).
[40]
Zhengyan Zhang, Cheng Yang, Zhiyuan Liu, Maosong Sun, Zhichong Fang, Bo Zhang, and Leyu Lin. 2020. COSINE: Compressive Network Embedding on Large-Scale Information Networks. In IEEE Transactions on Knowledge and Data Engineering.
[41]
Bo Zhao and Hakan Bilen. 2021. Dataset Condensation with Differentiable Siamese Augmentation. In International Conference on Machine Learning (ICML).
[42]
Bo Zhao, Konda Reddy Mopuri, and Hakan Bilen. 2021. Dataset Condensation with Gradient Matching. In International Conference on Learning Representations (ICLR).
[43]
Da Zheng, Chao Ma, Minjie Wang, Jinjing Zhou, Qidong Su, Xiang Song, Quan Gan, Zheng Zhang, and George Karypis. 2020. Distdgl: distributed graph neural network training for billion-scale graphs. In 2020 IEEE/ACM 10th Workshop on Irregular Applications: Architectures and Algorithms (IA3). IEEE, 36--44.
[44]
Da Zheng, Xiang Song, Chengru Yang, Dominique LaSalle, and George Karypis. 2021. Distributed Hybrid CPU and GPU training for Graph Neural Networks on Billion-Scale Graphs. arXiv preprint arXiv:2112.15345 (2021).
[45]
Jiong Zhu, Aishwarya Naresh Reganti, Edward W Huang, Charles Andrew Dickens, Nikhil Rao, Karthik Subbian, and Danai Koutra. 2023. Simplifying Distributed Neural Network Training on Massive Graphs: Randomized Partitions Improve Model Aggregation. In ICML Workshop on Localized Learning (LLW).

Index Terms

  1. Graph Coarsening via Convolution Matching for Scalable Graph Neural Network Training

      Recommendations

      Comments

      Information & Contributors

      Information

      Published In

      cover image ACM Conferences
      WWW '24: Companion Proceedings of the ACM Web Conference 2024
      May 2024
      1928 pages
      ISBN:9798400701726
      DOI:10.1145/3589335
      This work is licensed under a Creative Commons Attribution International 4.0 License.

      Sponsors

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 13 May 2024

      Check for updates

      Author Tags

      1. graph convolutions
      2. graph neural networks
      3. graph summarization

      Qualifiers

      • Research-article

      Conference

      WWW '24
      Sponsor:
      WWW '24: The ACM Web Conference 2024
      May 13 - 17, 2024
      Singapore, Singapore

      Acceptance Rates

      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • 0
        Total Citations
      • 447
        Total Downloads
      • Downloads (Last 12 months)447
      • Downloads (Last 6 weeks)84
      Reflects downloads up to 23 Dec 2024

      Other Metrics

      Citations

      View Options

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Login options

      Media

      Figures

      Other

      Tables

      Share

      Share

      Share this Publication link

      Share on social media