DOI: 10.1145/3583780.3615166

Self-supervised Learning and Graph Classification under Heterophily

Published: 21 October 2023

Abstract

Most existing pre-training strategies adopt popular Graph Neural Networks (GNNs), which act as a special form of low-pass filter and therefore fail to effectively capture heterophily. In this paper, we first present an experimental investigation of low-pass and high-pass filters in heterophily graph classification; the results clearly show that high-frequency signals are important for learning heterophily graph representations. Moreover, it remains unclear how to effectively capture the structural patterns of graphs and how to measure the ability of a self-supervised pre-training strategy to capture graph structure. To address this problem, we first design a quantitative Metric for Graph Structure (MGS), which analyzes the correlation between the structural similarity and the embedding similarity of graph pairs. Then, to enhance the graph structural information captured by self-supervised learning, we propose a novel self-supervised strategy for Pre-training GNNs based on this Metric (PGM). Extensive experiments validate that our pre-training strategy achieves state-of-the-art performance on molecular property prediction and protein function prediction. In addition, we find that choosing a suitable filter may sometimes be better than designing a good pre-training strategy for heterophily graph classification.
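The abstract does not give MGS's exact formula, but its core idea — correlating the structural similarity of graph pairs with the similarity of their learned embeddings — can be sketched with a simple Pearson correlation over all pairs. The function name `mgs_like_score` and the toy similarity matrices below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def mgs_like_score(struct_sim, embed_sim):
    """Pearson correlation between pairwise structural similarity and
    pairwise embedding similarity, taken over the upper triangle of the
    graph-pair similarity matrices. A score near 1 means the embedding
    space preserves structural relationships between graphs."""
    iu = np.triu_indices(struct_sim.shape[0], k=1)
    s, e = struct_sim[iu], embed_sim[iu]
    s = (s - s.mean()) / s.std()
    e = (e - e.mean()) / e.std()
    return float(np.mean(s * e))

# Toy example: 4 graphs with hand-made pairwise structural similarities
# (graphs 0/1 are structurally alike, as are graphs 2/3).
struct = np.array([[1.0, 0.9, 0.2, 0.1],
                   [0.9, 1.0, 0.3, 0.2],
                   [0.2, 0.3, 1.0, 0.8],
                   [0.1, 0.2, 0.8, 1.0]])

# Embeddings whose cosine similarities roughly mirror that grouping.
emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.1, 0.9], [0.0, 1.0]])
emb_n = emb / np.linalg.norm(emb, axis=1, keepdims=True)
embed_sim = emb_n @ emb_n.T

print(round(mgs_like_score(struct, embed_sim), 3))
```

Under this reading, a pre-training strategy that scores higher on the metric has embeddings that better track graph structure, which is the signal PGM is designed to amplify.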



Published In

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
October 2023
5508 pages
ISBN: 9798400701245
DOI: 10.1145/3583780

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. graph neural networks
  2. heterophily
  3. self-supervised learning

Qualifiers

  • Short-paper

Acceptance Rates

Overall acceptance rate: 1,861 of 8,427 submissions (22%)


Article Metrics

  • Total citations: 0
  • Total downloads: 130
  • Downloads (last 12 months): 84
  • Downloads (last 6 weeks): 8
Reflects downloads up to 22 Jan 2025
