research-article

Scapin: Scalable Graph Structure Perturbation by Augmented Influence Maximization

Published: 20 June 2023

Abstract

Generating data perturbations on graphs has become a useful tool for analyzing the robustness of Graph Neural Networks (GNNs). However, existing model-driven methodologies can be prohibitively expensive to apply to large graphs, which hinders the understanding of GNN robustness at scale. In this paper, we present Scapin, a data-driven methodology that opens up a new perspective by connecting graph structure perturbation for GNNs with augmented influence maximization, which aims to either facilitate desirable spreads or curtail undesirable ones by adding or deleting a small set of edges. This connection not only makes data perturbation on GNNs computationally scalable but also gives the perturbations an intuitive interpretation. To turn this connection into efficient perturbation approaches for the new GNN setting, Scapin introduces a novel edge influence model, decomposed influence maximization objectives, and a principled algorithm for edge addition that exploits the submodularity of the objectives. Empirical studies demonstrate that Scapin can improve runtime and memory efficiency by orders of magnitude over state-of-the-art methods, with comparable or even better performance.
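
The abstract names submodularity as the property that the edge-addition algorithm exploits, but this page does not give the algorithm itself. As a rough illustration of the general idea only, and not of Scapin's method or its edge influence model, the sketch below runs a generic lazy greedy selection over candidate edges for a monotone submodular objective. The names `lazy_greedy`, `reach`, and `coverage`, and the toy data, are hypothetical: each candidate edge is simply assumed to "reach" a fixed set of nodes, and the objective counts distinct reached nodes.

```python
import heapq
from typing import Callable, Dict, Hashable, List, Sequence, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def lazy_greedy(
    candidates: Sequence[Edge],
    objective: Callable[[List[Edge]], float],
    budget: int,
) -> List[Edge]:
    """Greedily pick up to `budget` edges for a monotone submodular objective.

    Marginal gains can only shrink as the selection grows, so a gain computed
    earlier is an upper bound on the current gain. That lets us re-evaluate
    only the candidate at the top of the heap instead of all of them.
    """
    selected: List[Edge] = []
    current = objective(selected)
    # Max-heap via negated gains; the index breaks ties so edges are never compared.
    heap = [(-(objective([c]) - current), i, c) for i, c in enumerate(candidates)]
    heapq.heapify(heap)

    while heap and len(selected) < budget:
        _, i, edge = heapq.heappop(heap)
        gain = objective(selected + [edge]) - current  # refresh the stale gain
        if not heap or gain >= -heap[0][0]:
            # The refreshed gain beats every remaining upper bound.
            if gain <= 0:
                break  # nothing left can improve the objective
            selected.append(edge)
            current += gain
        else:
            heapq.heappush(heap, (-gain, i, edge))
    return selected


# Hypothetical toy data: nodes assumed to be reached if the edge is added.
reach: Dict[Edge, Set[int]] = {
    ("a", "b"): {1, 2, 3},
    ("a", "c"): {3, 4},
    ("b", "d"): {1, 2},
    ("c", "d"): {5},
}


def coverage(edges: List[Edge]) -> int:
    """Number of distinct nodes reached by the chosen edges (a coverage function)."""
    covered: Set[int] = set()
    for e in edges:
        covered |= reach[e]
    return len(covered)


chosen = lazy_greedy(list(reach), coverage, budget=2)
print(chosen, coverage(chosen))  # [('a', 'b'), ('a', 'c')] 4
```

A coverage function is used here only because it is a standard example of a monotone submodular objective; for such objectives, plain greedy selection is known to achieve a (1 - 1/e) approximation, and the lazy variant returns the same kind of solution with fewer objective evaluations.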

Supplemental Material

MP4 File: Presentation video for SIGMOD 2023
PDF File: Read me
ZIP File: Source Code

      Published In

      Proceedings of the ACM on Management of Data (PACMMOD), Volume 1, Issue 2
      June 2023
      2310 pages
      EISSN: 2836-6573
      DOI: 10.1145/3605748
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. augmented influence maximization
      2. data perturbation
      3. graph neural network

      Qualifiers

      • Research-article

      Cited By

      • (2024) DIDS: Double Indices and Double Summarizations for Fast Similarity Search. Proceedings of the VLDB Endowment 17:9 (2198-2211). DOI: 10.14778/3665844.3665851. Online publication date: 1-May-2024.
      • (2024) CIVET: Exploring Compact Index for Variable-Length Subsequence Matching on Time Series. Proceedings of the VLDB Endowment 17:9 (2123-2135). DOI: 10.14778/3665844.3665845. Online publication date: 1-May-2024.
      • (2024) Visualization-Aware Time Series Min-Max Caching with Error Bound Guarantees. Proceedings of the VLDB Endowment 17:8 (2091-2103). DOI: 10.14778/3659437.3659460. Online publication date: 31-May-2024.
      • (2024) Performance-Based Pricing for Federated Learning via Auction. Proceedings of the VLDB Endowment 17:6 (1269-1282). DOI: 10.14778/3648160.3648169. Online publication date: 3-May-2024.
      • (2024) Hybrid Prompt Learning for Generating Justifications of Security Risks in Automation Rules. ACM Transactions on Intelligent Systems and Technology. DOI: 10.1145/3675401. Online publication date: 29-Jun-2024.
      • (2024) Databases in Edge and Fog Environments: A Survey. ACM Computing Surveys 56:11 (1-40). DOI: 10.1145/3666001. Online publication date: 8-Jul-2024.
      • (2024) RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search. Proceedings of the ACM on Management of Data 2:3 (1-27). DOI: 10.1145/3654970. Online publication date: 30-May-2024.
      • (2024) Convolution and Cross-Correlation of Count Sketches Enables Fast Cardinality Estimation of Multi-Join Queries. Proceedings of the ACM on Management of Data 2:3 (1-26). DOI: 10.1145/3654932. Online publication date: 30-May-2024.
      • (2024) Time Series Representation for Visualization in Apache IoTDB. Proceedings of the ACM on Management of Data 2:1 (1-26). DOI: 10.1145/3639290. Online publication date: 26-Mar-2024.
      • (2024) NPA: Improving Large-scale Graph Neural Networks with Non-parametric Attention. Companion of the 2024 International Conference on Management of Data (414-427). DOI: 10.1145/3626246.3653399. Online publication date: 9-Jun-2024.

      View Options

      Get Access

      Login options

      Full Access

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Media

      Figures

      Other

      Tables

      Share

      Share

      Share this Publication link

      Share on social media