research-article

Personalized PageRank on Evolving Graphs with an Incremental Index-Update Scheme

Authors:

Fangyuan Zhang,

Zhewei Wei

Proceedings of the ACM on Management of Data, Volume 1, Issue 1

Article No.: 25, Pages 1 - 26

https://doi.org/10.1145/3588705

Published: 30 May 2023

Abstract

Personalized PageRank (PPR) is a fundamental proximity measure in graph mining. Given an input graph G with decay probability α, a source node s, and a target node t, the PPR score π(s,t) of target t with respect to source s is the probability that an α-decay random walk starting from s stops at t. A single-source PPR (SSPPR) query takes an input graph G with decay probability α and a source s, and returns the PPR score π(s,v) for each node v ∈ V. Since computing an exact SSPPR query answer is prohibitively expensive, most existing solutions turn to approximate queries with guarantees. The state-of-the-art solutions for approximate SSPPR queries are index-based and mainly focus on static graphs, whereas real-world graphs usually change dynamically, and existing index-update schemes cannot achieve sub-linear update time. Motivated by this, we present an efficient indexing scheme for single-source PPR queries on evolving graphs. Our proposed solution is based on a classic framework that combines the forward-push technique with a random walk index for approximate PPR queries. Our indexing scheme is therefore similar to existing solutions in that we store pre-sampled random walks for efficient query processing. One of our main contributions is an incremental update scheme that maintains the indexed random walks in expected O(1) time after each graph update. To achieve O(1) update cost, we need to maintain auxiliary data structures for both vertices and edges. To reduce space consumption, we further revisit the sampling methods and propose a new sampling scheme that removes the auxiliary data structure for vertices while still supporting O(1) index update cost on evolving graphs. Extensive experiments show that our update scheme achieves orders-of-magnitude speed-up in update performance over existing index-based dynamic schemes without sacrificing query efficiency.
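The definition of π(s,t) in the abstract can be illustrated with a plain Monte Carlo estimate: run many α-decay random walks from s and record where each one stops. The sketch below is only that baseline, not the paper's forward-push framework or its incremental index. The function names, the adjacency-list dictionary, and the rule that a walk stops at a node with no out-neighbors are all assumptions made for illustration.

```python
import random
from collections import Counter


def alpha_decay_walk(graph, source, alpha, rng):
    """Run one alpha-decay random walk from `source`; return the node where it stops."""
    node = source
    while True:
        # At every step the walk stops at the current node with probability alpha.
        if rng.random() < alpha:
            return node
        neighbors = graph.get(node, [])
        # Assumption: a walk at a node without out-neighbors stops there.
        if not neighbors:
            return node
        # Otherwise move to a uniformly random out-neighbor.
        node = rng.choice(neighbors)


def estimate_ssppr(graph, source, alpha=0.2, num_walks=20000, seed=0):
    """Estimate pi(source, v) for every node v as the fraction of walks stopping at v.

    `graph` is a hypothetical adjacency-list dict that has every node as a key.
    """
    rng = random.Random(seed)
    stops = Counter(
        alpha_decay_walk(graph, source, alpha, rng) for _ in range(num_walks)
    )
    return {v: stops[v] / num_walks for v in graph}


if __name__ == "__main__":
    # Toy directed graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
    toy_graph = {0: [1, 2], 1: [2], 2: [0]}
    for node, score in estimate_ssppr(toy_graph, source=0).items():
        print(f"pi(0, {node}) ~ {score:.3f}")
```

Index-based methods of the kind the abstract describes store such walks ahead of time instead of sampling them at query time, which is why the stored walks must be repaired when an edge is inserted or deleted.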




Index Terms

Personalized PageRank on Evolving Graphs with an Incremental Index-Update Scheme
1. Theory of computation
  1. Design and analysis of algorithms
    1. Graph algorithms analysis



Published In


Proceedings of the ACM on Management of Data Volume 1, Issue 1

PACMMOD

May 2023

2807 pages

EISSN:2836-6573

DOI:10.1145/3603164

Editor:
Divyakant Agrawal
UC Santa Barbara, United States


Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Funding Sources

Hong Kong RGC CRF Grant
CCF-Baidu Open Fund
Beijing Natural Science Foundation
Hong Kong ITC ITF Grant
National Natural Science Foundation of China
Hong Kong RGC GRF Grant
Hong Kong RGC ECS Grant
