DOI: 10.1145/3447548.3467450
Research Article | Open Access

Adaptive Transfer Learning on Graph Neural Networks

Published: 14 August 2021

Abstract

Graph neural networks (GNNs) are widely used to learn powerful representations of graph-structured data. Recent work demonstrates that transferring knowledge from self-supervised tasks to downstream tasks can further improve graph representations. However, there is an inherent gap between self-supervised tasks and downstream tasks in terms of optimization objective and training data, and conventional pre-training methods may not transfer knowledge effectively because they make no adaptation to the downstream task. To address this, we propose a new transfer learning paradigm for GNNs that effectively leverages self-supervised tasks as auxiliary tasks to help the target task. Our method adaptively selects and combines different auxiliary tasks with the target task during the fine-tuning stage. We design an adaptive auxiliary loss weighting model that learns the weights of auxiliary tasks by quantifying the consistency between each auxiliary task and the target task, and we train this weighting model through meta-learning. Our method can be applied to various transfer learning approaches: it performs well not only in multi-task learning but also in pre-training and fine-tuning. Comprehensive experiments on multiple downstream tasks demonstrate that the proposed method effectively combines auxiliary tasks with the target task and significantly improves performance over state-of-the-art methods.
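The abstract compresses the core mechanism, so a small worked sketch may help make it concrete. The following is a minimal, hypothetical PyTorch illustration, not the authors' code: it scores each auxiliary self-supervised loss by the cosine similarity between its gradient and the target-task gradient (one simple way to quantify the "consistency" the abstract mentions) and uses the clipped similarity as the loss weight. The paper's actual method instead learns an adaptive weighting model trained through meta-learning; TinyGNN, both auxiliary objectives, and all names below are illustrative assumptions.

# Hypothetical sketch (not the authors' implementation): weight each auxiliary
# self-supervised loss by the cosine similarity between its gradient and the
# target-task gradient, a crude stand-in for the paper's learned,
# meta-trained weighting model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyGNN(nn.Module):
    """Stand-in one-layer 'GNN': mean-aggregate neighbors, then project."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, hid_dim)

    def forward(self, x, adj):
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)  # avoid divide-by-zero
        return torch.relu(self.proj(adj @ x / deg))

def flat_grad(loss, params):
    """Flatten d(loss)/d(params) into one vector; keep the graph for reuse."""
    grads = torch.autograd.grad(loss, params, retain_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])

def auxiliary_weights(target_loss, aux_losses, params):
    """One weight per auxiliary loss: max(0, cos(grad_aux, grad_target))."""
    g_tgt = flat_grad(target_loss, params)
    return [F.cosine_similarity(flat_grad(l, params), g_tgt, dim=0).clamp(min=0.0)
            for l in aux_losses]

# Toy data: 8 nodes, 5 input features, binary node labels.
n, d, h = 8, 5, 4
x, y = torch.randn(n, d), torch.randint(0, 2, (n,))
adj = (torch.rand(n, n) > 0.5).float()
model, head = TinyGNN(d, h), nn.Linear(h, 2)
params = list(model.parameters())

z = model(x, adj)
target_loss = F.cross_entropy(head(z), y)
# Two illustrative self-supervised auxiliary objectives:
aux_losses = [
    F.binary_cross_entropy_with_logits(z @ z.t(), adj),  # link reconstruction
    F.mse_loss(z.mean(dim=1), x.mean(dim=1)),            # feature-statistic match
]
weights = auxiliary_weights(target_loss, aux_losses, params)
total = target_loss + sum(w * l for w, l in zip(weights, aux_losses))
total.backward()  # consistent auxiliary tasks contribute more to this update
print([f"{float(w):.3f}" for w in weights])

Under such a scheme the weights are recomputed as fine-tuning progresses, so an auxiliary task whose gradient conflicts with the target task is automatically down-weighted. That is the intuition behind the adaptive combination described above, with the paper replacing the fixed cosine heuristic by a weighting model learned via meta-learning.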

Supplementary Material

MP4 File (adaptive_transfer_learning_on_graph-xueting_han-zhenhuan_huang-38958024-7rlh.mp4)
Presentation video




Information

Published In

KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
August 2021
4259 pages
ISBN:9781450383325
DOI:10.1145/3447548
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. GNN pre-training
  2. graph neural networks
  3. graph representation learning
  4. multi-task learning
  5. transfer learning

Qualifiers

  • Research-article

Conference

KDD '21

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%




Bibliometrics & Citations


Article Metrics

  • Downloads (Last 12 months)909
  • Downloads (Last 6 weeks)122
Reflects downloads up to 08 Mar 2025

Cited By
  • (2024) Exploring correlations of self-supervised tasks for graphs. Proceedings of the 41st International Conference on Machine Learning. DOI: 10.5555/3692070.3692588, pp. 12957-12972. Online publication date: 21-Jul-2024
  • (2024) Subgraph pooling. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence. DOI: 10.24963/ijcai.2024/570, pp. 5153-5161. Online publication date: 3-Aug-2024
  • (2024) UniGM: Unifying Multiple Pre-trained Graph Models via Adaptive Knowledge Aggregation. Proceedings of the 32nd ACM International Conference on Multimedia. DOI: 10.1145/3664647.3681018, pp. 8556-8565. Online publication date: 28-Oct-2024
  • (2024) Unify Graph Learning with Text: Unleashing LLM Potentials for Session Search. Proceedings of the ACM Web Conference 2024. DOI: 10.1145/3589334.3645574, pp. 1509-1518. Online publication date: 13-May-2024
  • (2024) A Dual-channel Semi-supervised Learning Framework on Graphs via Knowledge Transfer and Meta-learning. ACM Transactions on the Web 18(2), pp. 1-26. DOI: 10.1145/3577033. Online publication date: 8-Jan-2024
  • (2024) Search to Fine-Tune Pre-Trained Graph Neural Networks for Graph-Level Tasks. 2024 IEEE 40th International Conference on Data Engineering (ICDE). DOI: 10.1109/ICDE60146.2024.00219, pp. 2805-2819. Online publication date: 13-May-2024
  • (2024) Ensemble of Graph Neural Networks for Enhanced Financial Fraud Detection. 2024 IEEE 9th International Conference for Convergence in Technology (I2CT). DOI: 10.1109/I2CT61223.2024.10543898, pp. 1-8. Online publication date: 5-Apr-2024
  • (2024) TO-UGDA: target-oriented unsupervised graph domain adaptation. Scientific Reports 14(1). DOI: 10.1038/s41598-024-59890-y. Online publication date: 22-Apr-2024
  • (2024) Addressing imbalance in graph datasets: Introducing GATE-GNN with graph ensemble weight attention and transfer learning for enhanced node classification. Expert Systems with Applications 255, 124602. DOI: 10.1016/j.eswa.2024.124602. Online publication date: Dec-2024
  • (2023) Graph-Aware Language Model Pre-Training on a Large Graph Corpus Can Help Multiple Graph Applications. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. DOI: 10.1145/3580305.3599833, pp. 5270-5281. Online publication date: 6-Aug-2023
