DOI: 10.1145/3543507.3583379
Research article | Open access

GraphMAE2: A Decoding-Enhanced Masked Self-Supervised Graph Learner

Published: 30 April 2023

Abstract

Graph self-supervised learning (SSL), including contrastive and generative approaches, offers great potential to address the fundamental challenge of label scarcity in real-world graph data. Among these techniques, masked graph autoencoders (e.g., GraphMAE), a type of generative method, have recently produced promising results. The idea is to reconstruct the node features (or structures) that are randomly masked from the input with an autoencoder architecture. However, the performance of masked feature reconstruction naturally relies on the discriminability of the input features and is usually vulnerable to disturbances in the features. In this paper, we present a masked self-supervised learning framework, GraphMAE2, with the goal of overcoming this issue. The core idea is to impose regularization on feature reconstruction for graph SSL. Specifically, we design two strategies, multi-view random re-mask decoding and latent representation prediction, to regularize the feature reconstruction. Multi-view random re-mask decoding introduces randomness into reconstruction in the feature space, while latent representation prediction enforces reconstruction in the embedding space. Extensive experiments show that GraphMAE2 consistently generates top results on various public datasets, including improvements of at least 2.45% over state-of-the-art baselines on ogbn-Papers100M with 111M nodes and 1.6B edges.
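To make the two regularization strategies described above concrete, the following is a minimal PyTorch sketch of masked feature reconstruction with multi-view random re-mask decoding and latent representation prediction. It is an illustration under assumptions, not the authors' implementation: plain MLPs stand in for the GNN encoder and decoders, the latent targets come from a detached pass of the same encoder, and all names and hyperparameters (MaskedGraphSSLSketch, mask_rate, remask_rate, num_views) are invented for this example.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedGraphSSLSketch(nn.Module):
    def __init__(self, in_dim, hid_dim, num_views=2, mask_rate=0.5, remask_rate=0.5):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.PReLU())
        self.feat_decoder = nn.Linear(hid_dim, in_dim)       # reconstructs raw features
        self.latent_predictor = nn.Linear(hid_dim, hid_dim)  # predicts target embeddings
        self.mask_token = nn.Parameter(torch.zeros(1, in_dim))
        self.num_views = num_views
        self.mask_rate = mask_rate
        self.remask_rate = remask_rate

    def forward(self, x):
        n = x.size(0)
        # 1) Randomly mask a subset of node features at the input.
        masked = torch.rand(n, device=x.device) < self.mask_rate
        x_in = torch.where(masked.unsqueeze(-1), self.mask_token.expand(n, -1), x)
        h = self.encoder(x_in)

        # 2) Multi-view random re-mask decoding: re-mask the latent codes with a
        #    different random pattern before each decoding pass, so the feature
        #    reconstruction sees several randomized views of the embeddings.
        rec_loss = 0.0
        for _ in range(self.num_views):
            remask = torch.rand(n, device=x.device) < self.remask_rate
            h_view = h.masked_fill(remask.unsqueeze(-1), 0.0)
            x_rec = self.feat_decoder(h_view)
            rec_loss = rec_loss + F.mse_loss(x_rec[masked], x[masked])

        # 3) Latent representation prediction: regress the embeddings produced by a
        #    detached encoder pass over the unmasked input (the "target" embeddings),
        #    enforcing reconstruction in the embedding space rather than feature space.
        with torch.no_grad():
            h_target = self.encoder(x)
        lat_loss = F.mse_loss(self.latent_predictor(h)[masked], h_target[masked])

        return rec_loss / self.num_views + lat_loss

# Usage on random data, purely for illustration:
model = MaskedGraphSSLSketch(in_dim=128, hid_dim=64)
loss = model(torch.randn(100, 128))
loss.backward()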




Published In

WWW '23: Proceedings of the ACM Web Conference 2023
April 2023
4293 pages
ISBN: 9781450394161
DOI: 10.1145/3543507
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 April 2023


Author Tags

  1. Graph Neural Networks
  2. Graph Representation Learning
  3. Pre-Training
  4. Self-Supervised Learning

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Natural Science Foundation of China

Conference

WWW '23: The ACM Web Conference 2023
April 30 - May 4, 2023
Austin, TX, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Article Metrics

  • Downloads (last 12 months): 1,148
  • Downloads (last 6 weeks): 161
Reflects downloads up to 03 Oct 2024

Cited By

  • (2024) ProtoMGAE: Prototype-Aware Masked Graph Auto-Encoder for Graph Representation Learning. ACM Transactions on Knowledge Discovery from Data, 18(6), 1-22. DOI: 10.1145/3649143. Online publication date: 12-Apr-2024.
  • (2024) Pre-Training and Prompting for Few-Shot Node Classification on Text-Attributed Graphs. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 4467-4478. DOI: 10.1145/3637528.3671952. Online publication date: 25-Aug-2024.
  • (2024) All in One and One for All: A Simple yet Effective Method towards Cross-domain Graph Pretraining. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 4443-4454. DOI: 10.1145/3637528.3671913. Online publication date: 25-Aug-2024.
  • (2024) Content-based Graph Reconstruction for Cold-start Item Recommendation. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1263-1273. DOI: 10.1145/3626772.3657801. Online publication date: 10-Jul-2024.
  • (2024) DHMAE: A Disentangled Hypergraph Masked Autoencoder for Group Recommendation. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 914-923. DOI: 10.1145/3626772.3657699. Online publication date: 10-Jul-2024.
  • (2024) Endowing Pre-trained Graph Models with Provable Fairness. Proceedings of the ACM Web Conference 2024, 1045-1056. DOI: 10.1145/3589334.3645703. Online publication date: 13-May-2024.
  • (2024) Masked Graph Autoencoder with Non-discrete Bandwidths. Proceedings of the ACM Web Conference 2024, 377-388. DOI: 10.1145/3589334.3645370. Online publication date: 13-May-2024.
  • (2024) FATE: Feature-Agnostic Transformer-based Encoder for learning generalized embedding spaces in flow cytometry data. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 7941-7949. DOI: 10.1109/WACV57701.2024.00777. Online publication date: 3-Jan-2024.
  • (2024) Multi-Faceted Negative Sample Mining for Graph Contrastive Learning. 2024 International Joint Conference on Neural Networks (IJCNN), 1-8. DOI: 10.1109/IJCNN60899.2024.10650308. Online publication date: 30-Jun-2024.
  • (2024) Self-Supervised Masked Hypergraph Autoencoders for Spatio-Temporal Forecasting. 2024 International Joint Conference on Neural Networks (IJCNN), 1-8. DOI: 10.1109/IJCNN60899.2024.10650073. Online publication date: 30-Jun-2024.
  • Show More Cited By
