Graph Neural Transport Networks with Non-local Attentions for Recommender Systems

Published: 25 April 2022

    Abstract

    Graph Neural Networks (GNNs) have emerged as powerful tools for collaborative filtering. A key challenge of recommendations is to distill long-range collaborative signals from user-item graphs. Typically, GNNs generate embeddings of users/items by propagating and aggregating the messages between local neighbors. Thus, the ability of GNNs to capture long-range dependencies heavily depends on their depths. However, simply training deep GNNs has several bottleneck effects, e.g., over-fitting and over-smoothing, which may lead to unexpected results if GNNs are not well regularized.
    Here we present Graph Optimal Transport Networks (GOTNet) to capture long-range dependencies without increasing the depths of GNNs. Specifically, we perform k-Means clustering on nodes’ GNN embeddings to obtain graph-level representations (e.g., centroids). We then compute node-centroid attentions, which enable long-range messages to be communicated among distant but similar nodes. Our non-local attention operators work seamlessly with local operators in original GNNs. As such, GOTNet is able to capture both local and non-local messages in graphs by only using shallow GNNs, which avoids the bottleneck effects of deep GNNs. Experimental results demonstrate that GOTNet achieves better performance compared with state-of-the-art GNNs.
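
    The pipeline described in the abstract (one local propagation step, clustering of the resulting embeddings, then node-centroid attention) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: plain Lloyd-style k-means and a scaled dot-product softmax stand in for the paper's optimal-transport formulation, and every function name, the toy interaction list, the embedding size, and the additive way of combining local and non-local messages are assumptions made for illustration only.

```python
import numpy as np


def normalized_adjacency(n_users, n_items, interactions):
    """Symmetrically normalized adjacency of a user-item bipartite graph."""
    n = n_users + n_items
    adj = np.zeros((n, n))
    for u, i in interactions:
        adj[u, n_users + i] = 1.0
        adj[n_users + i, u] = 1.0
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]


def kmeans(x, k, n_iter=20, seed=0):
    """Plain Lloyd-style k-means (assumption: stands in for the paper's clustering)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every node to every centroid.
        dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        assign = dist.argmin(axis=1)
        for c in range(k):
            members = x[assign == c]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[c] = members.mean(axis=0)
    return centroids


def node_centroid_attention(x, centroids):
    """Softmax attention of each node over the centroids (hypothetical form)."""
    scores = x @ centroids.T / np.sqrt(x.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights, weights @ centroids


def shallow_layer_with_non_local(adj_norm, x, k):
    """One local propagation step plus a non-local message from the centroids."""
    local = adj_norm @ x                      # messages from direct neighbors
    centroids = kmeans(local, k)              # graph-level representations
    weights, non_local = node_centroid_attention(local, centroids)
    return local + non_local, weights         # assumed additive combination


# Toy graph: 4 users, 4 items, two disconnected groups of interactions.
interactions = [(0, 0), (0, 1), (1, 1), (2, 2), (3, 2), (3, 3)]
adj_norm = normalized_adjacency(4, 4, interactions)
x = np.random.default_rng(0).normal(size=(8, 16))
out, weights = shallow_layer_with_non_local(adj_norm, x, k=2)
print(out.shape, weights.shape)
```

    In this toy graph the two interaction groups share no edges, so a single local propagation step cannot pass information between them. The centroid term gives every node a message that depends on all nodes through the cluster means, which is the intuition behind reaching distant but similar nodes without stacking more layers.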

            Published In

            WWW '22: Proceedings of the ACM Web Conference 2022
            April 2022
            3764 pages
            ISBN: 9781450390965
            DOI: 10.1145/3485447

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Author Tags

            1. Graph Neural Networks
            2. Non-local Attentions
            3. Optimal Transport

            Qualifiers

            • Research-article
            • Research
            • Refereed limited

            Conference

            WWW '22
            WWW '22: The ACM Web Conference 2022
            April 25 - 29, 2022
            Virtual Event, Lyon, France

            Acceptance Rates

            Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

            Bibliometrics & Citations

            Bibliometrics

            Article Metrics

            • Downloads (Last 12 months): 137
            • Downloads (Last 6 weeks): 10
            Reflects downloads up to 27 Jul 2024

            Citations

            Cited By

            • (2024) ANAGL: A Noise-resistant and Anti-sparse Graph Learning for micro-video recommendation. ACM Transactions on Multimedia Computing, Communications, and Applications. DOI: 10.1145/3670407. Online publication date: 3-Jun-2024
            • (2024) Masked Graph Transformer for Large-Scale Recommendation. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2502-2506. DOI: 10.1145/3626772.3657971. Online publication date: 10-Jul-2024
            • (2024) Towards Mitigating Dimensional Collapse of Representations in Collaborative Filtering. Proceedings of the 17th ACM International Conference on Web Search and Data Mining, pp. 106-115. DOI: 10.1145/3616855.3635832. Online publication date: 4-Mar-2024
            • (2024) Can One Embedding Fit All? A Multi-Interest Learning Paradigm Towards Improving User Interest Diversity Fairness. Proceedings of the ACM on Web Conference 2024, pp. 1237-1248. DOI: 10.1145/3589334.3645662. Online publication date: 13-May-2024
            • (2024) PaCEr: Network Embedding From Positional to Structural. Proceedings of the ACM on Web Conference 2024, pp. 2485-2496. DOI: 10.1145/3589334.3645516. Online publication date: 13-May-2024
            • (2023) Probabilistic masked attention networks for explainable sequential recommendation. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, pp. 2068-2076. DOI: 10.24963/ijcai.2023/230. Online publication date: 19-Aug-2023
            • (2023) Enhancing Transformers without Self-supervised Learning: A Loss Landscape Perspective in Sequential Recommendation. Proceedings of the 17th ACM Conference on Recommender Systems, pp. 791-797. DOI: 10.1145/3604915.3608831. Online publication date: 14-Sep-2023
            • (2023) Hessian-aware Quantized Node Embeddings for Recommendation. Proceedings of the 17th ACM Conference on Recommender Systems, pp. 757-762. DOI: 10.1145/3604915.3608826. Online publication date: 14-Sep-2023
            • (2023) Domain Disentanglement with Interpolative Data Augmentation for Dual-Target Cross-Domain Recommendation. Proceedings of the 17th ACM Conference on Recommender Systems, pp. 515-527. DOI: 10.1145/3604915.3608802. Online publication date: 14-Sep-2023
            • (2023) Distribution-based Learnable Filters with Side Information for Sequential Recommendation. Proceedings of the 17th ACM Conference on Recommender Systems, pp. 78-88. DOI: 10.1145/3604915.3608782. Online publication date: 14-Sep-2023
