Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
research-article

Personalized PageRank on Evolving Graphs with an Incremental Index-Update Scheme

Published: 30 May 2023 Publication History

Abstract

\em Personalized PageRank (PPR) stands as a fundamental proximity measure in graph mining. Given an input graph G with the probability of decay α, a source node s and a target node t, the PPR score π(s,t) of target t with respect to source s is the probability that an α-decay random walk starting from s stops at t. A \em single-source PPR (SSPPR) query takes an input graph G with decay probability α and a source s, and then returns the PPR π(s,v) for each node v ∈ V. Since computing an exact SSPPR query answer is prohibitive, most existing solutions turn to approximate queries with guarantees. The state-of-the-art solutions for approximate SSPPR queries are index-based and mainly focus on static graphs, while real-world graphs are usually dynamically changing. However, existing index-update schemes can not achieve a sub-linear update time. Motivated by this, we present an efficient indexing scheme for single-source PPR queries on evolving graphs. Our proposed solution is based on a classic framework that combines the forward-push technique with a random walk index for approximate PPR queries. Thus, our indexing scheme is similar to existing solutions in the sense that we store pre-sampled random walks for efficient query processing. One of our main contributions is an incremental updating scheme to maintain indexed random walks in expected O(1) time after each graph update. To achieve O(1) update cost, we need to maintain auxiliary data structures for both vertices and edges. To reduce the space consumption, we further revisit the sampling methods and propose a new sampling scheme to remove the auxiliary data structure for vertices while still supporting O(1) index update cost on evolving graphs. Extensive experiments show that our update scheme achieves orders of magnitude speed-up on update performance over existing index-based dynamic schemes without sacrificing the query efficiency.

Supplemental Material

MP4 File
SIGMOD-FIRM-Presentation-Video

References

[1]
2013. KONECT. http://konect.cc/networks/.
[2]
2014. SNAP Datasets. http://snap.stanford.edu/data.
[3]
2022. Technical Report. https://arxiv.org/abs/2212.10288.
[4]
2023. Source Code. https://github.com/lalumine/firm.
[5]
Reid Andersen, Christian Borgs, Jennifer T. Chayes, John E. Hopcroft, Vahab S. Mirrokni, and Shang-Hua Teng. 2007. Local Computation of PageRank Contributions. In WAW. 150--165.
[6]
Reid Andersen, Fan R. K. Chung, and Kevin J. Lang. 2006. Local Graph Partitioning using PageRank Vectors. In FOCS. 475--486.
[7]
Bahman Bahmani, Abdur Chowdhury, and Ashish Goel. 2010. Fast Incremental and Personalized PageRank. Proc. VLDB Endow. 4, 3 (2010), 173--184.
[8]
Aleksandar Bojchevski, Johannes Klicpera, Bryan Perozzi, Amol Kapoor, Martin Blais, Benedek Rózemberczki, Michal Lukasik, and Stephan Günnemann. 2020. Scaling Graph Neural Networks with Approximate PageRank. In SIGKDD. 2464--2473.
[9]
Fan Chung and Linyuan Lu. 2006. Concentration Inequalities and Martingale Inequalities: A Survey. Internet Mathematics 3, 1 (2006), 79--127.
[10]
Xinyu Du, Xingyi Zhang, Sibo Wang, and Zengfeng Huang. 2023. Efficient Tree-SVD for Subset Node Embedding over Large Dynamic Graphs. PACMMOD 1, 1 (2023), 96:1--96:26.
[11]
Dániel Fogaras, Balázs Rácz, Károly Csalogány, and Tamás Sarlós. 2005. Towards scaling fully personalized pagerank: Algorithms, lower bounds, and experiments. Internet Mathematics 2, 3 (2005), 333--358.
[12]
Dongqi Fu and Jingrui He. 2021. SDG: A Simplified and Dynamic Graph Neural Network. In SIGIR. 2273--2277.
[13]
Yasuhiro Fujiwara, Makoto Nakatsuji, Hiroaki Shiokawa, Takeshi Mishima, and Makoto Onizuka. 2013. Efficient ad-hoc search for personalized PageRank. In SIGMOD. 445--456.
[14]
Yasuhiro Fujiwara, Makoto Nakatsuji, Takeshi Yamamuro, Hiroaki Shiokawa, and Makoto Onizuka. 2012. Efficient personalized pagerank with accuracy assurance. In KDD. 15--23.
[15]
Tao Guo, Xin Cao, Gao Cong, Jiaheng Lu, and Xuemin Lin. 2017. Distributed Algorithms on Exact Personalized PageRank. In SIGMOD. 479--494.
[16]
Wentian Guo, Yuchen Li, Mo Sha, and Kian-Lee Tan. 2017. Parallel Personalized Pagerank on Dynamic Graphs. PVLDB 11, 1 (2017), 93--106.
[17]
Xingzhi Guo, Baojian Zhou, and Steven Skiena. 2021. Subset Node Representation Learning over Large Dynamic Graphs. In KDD. ACM, 516--526.
[18]
Pankaj Gupta, Ashish Goel, Jimmy Lin, Aneesh Sharma, Dong Wang, and Reza Zadeh. 2013. WTF: The who to follow service at twitter. In WWW. 505--514.
[19]
Zoltán Gyöngyi, Pavel Berkhin, Hector Garcia-Molina, and Jan O. Pedersen. 2006. Link Spam Detection Based on Mass Estimation. In VLDB. 439--450.
[20]
Guanhao Hou, Xingguang Chen, Sibo Wang, and Zhewei Wei. 2021. Massively Parallel Algorithms for Personalized PageRank. PVLDB 14, 9 (2021), 1668--1680.
[21]
Glen Jeh and Jennifer Widom. 2003. Scaling personalized web search. In WWW. 271--279.
[22]
Jinhong Jung, Namyong Park, Lee Sael, and U Kang. 2017. BePI: Fast and Memory-Efficient Method for Billion-Scale Random Walk with Restart. In SIGMOD. 789--804.
[23]
Dandan Lin, Raymond Chi-Wing Wong, Min Xie, and Victor Junqiu Wei. 2020. Index-Free Approach with Theoretical Guarantee for Efficient Random Walk with Restart Query. In ICDE. 913--924.
[24]
Wenqing Lin. 2019. Distributed Algorithms for Fully Personalized PageRank on Large Graphs. In WWW. 1084--1094.
[25]
Peter Lofgren, Siddhartha Banerjee, and Ashish Goel. 2015. Bidirectional PageRank Estimation: From Average-Case to Worst-Case. In WAW 2015. 164--176.
[26]
Peter Lofgren, Siddhartha Banerjee, Ashish Goel, and Comandur Seshadhri. 2014. Fast-ppr: Scaling personalized pagerank estimation for large graphs. In KDD. 1436--1445.
[27]
Siqiang Luo. 2019. Distributed PageRank Computation: An Improved Theoretical Study. In AAAI. 4496--4503.
[28]
Takanori Maehara, Takuya Akiba, Yoichi Iwata, and Ken ichi Kawarabayashi. 2014. Computing personalized PageRank quickly by exploiting graph structures. PVLDB 7, 12 (2014), 1023--1034.
[29]
Dingheng Mo and Siqiang Luo. 2021. Agenda: Robust Personalized PageRanks in Evolving Graphs. In CIKM. 1315--1324.
[30]
Naoto Ohsaka, Takanori Maehara, and Ken-ichi Kawarabayashi. 2015. Efficient PageRank Tracking in Evolving Networks. In SIGKDD. 875--884.
[31]
Mingdong Ou, Peng Cui, Jian Pei, Ziwei Zhang, and Wenwu Zhu. 2016. Asymmetric Transitivity Preserving Graph Embedding. In KDD. 1105--1114.
[32]
Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. 1999. The PageRank citation ranking: bringing order to the web. Technical Report. Stanford InfoLab.
[33]
Atish Das Sarma, Anisur Rahaman Molla, Gopal Pandurangan, and Eli Upfal. 2013. Fast Distributed PageRank Computation. In ICDCN. 11--26.
[34]
Jieming Shi, Renchi Yang, Tianyuan Jin, Xiaokui Xiao, and Yin Yang. 2019. Realtime Top-k Personalized PageRank over Large Graphs on GPUs. PVLDB 13, 1 (2019), 15--28.
[35]
Kijung Shin, Jinhong Jung, Lee Sael, and U. Kang. 2015. BEAR: Block Elimination Approach for Random Walk with Restart on Large Graphs. In SIGMOD. 1571--1585.
[36]
Anton Tsitsulin, Davide Mottin, Panagiotis Karras, and Emmanuel Müller. 2018. VERSE: Versatile Graph Embeddings from Similarity Measures. In WWW. 539--548.
[37]
Hanzhi Wang, Zhewei Wei, Junhao Gan, Sibo Wang, and Zengfeng Huang. 2020. Personalized PageRank to a Target Node, Revisited. In SIGKDD. 657--667.
[38]
Runhui Wang, Sibo Wang, and Xiaofang Zhou. 2019. Parallelizing approximate single-source personalized PageRank queries on shared memory. VLDB J. 28, 6 (2019), 923--940.
[39]
Sibo Wang, Youze Tang, Xiaokui Xiao, Yin Yang, and Zengxiang Li. 2016. HubPPR: Effective Indexing for Approximate Personalized PageRank. PVLDB 10, 3 (2016), 205--216.
[40]
Sibo Wang and Yufei Tao. 2018. Efficient Algorithms for Finding Approximate Heavy Hitters in Personalized PageRanks. In SIGMOD. 1113--1127.
[41]
Sibo Wang, Renchi Yang, Runhui Wang, Xiaokui Xiao, Zhewei Wei, Wenqing Lin, Yin Yang, and Nan Tang. 2019. Efficient Algorithms for Approximate Single-Source Personalized PageRank Queries. TODS 44, 4 (2019), 18:1--18:37.
[42]
Sibo Wang, Renchi Yang, Xiaokui Xiao, Zhewei Wei, and Yin Yang. 2017. FORA: Simple and Effective Approximate Single-Source Personalized PageRank. In SIGKDD. 505--514.
[43]
Zhewei Wei, Xiaodong He, Xiaokui Xiao, Sibo Wang, Shuo Shang, and Ji-Rong Wen. 2018. TopPPR: Top-k Personalized PageRank Queries with Precision Guarantees on Large Graphs. In SIGMOD. 441--456.
[44]
Hao Wu, Junhao Gan, Zhewei Wei, and Rui Zhang. 2021. Unifying the Global and Local Approaches: An Efficient Power Iteration with Forward Push. In SIGMOD. 1996--2008.
[45]
Renchi Yang, Jieming Shi, Xiaokui Xiao, Yin Yang, and Sourav S. Bhowmick. 2020. Homogeneous Network Embedding for Massive Graphs via Reweighted Personalized PageRank. Proc. VLDB Endow. 13, 5 (2020), 670--683.
[46]
Yuan Yin and Zhewei Wei. 2019. Scalable Graph Embeddings via Sparse Transpose Proximities. In KDD. 1429--1437.
[47]
Hongyang Zhang, Peter Lofgren, and Ashish Goel. 2016. Approximate Personalized PageRank on Dynamic Graphs. In KDD. 1315--1324.
[48]
Xingyi Zhang, Kun Xie, Sibo Wang, and Zengfeng Huang. 2021. Learning Based Proximity Matrix Factorization for Node Embedding. In KDD. 2243--2253.
[49]
Fanwei Zhu, Yuan Fang, Kevin Chen-Chuan Chang, and Jing Ying. 2013. Incremental and Accuracy-Aware Personalized PageRank through Scheduled Approximation. PVLDB 6, 6 (2013), 481--492.

Cited By

View all
  • (2024)BIRD: Efficient Approximation of Bidirectional Hidden Personalized PageRankProceedings of the VLDB Endowment10.14778/3665844.366585517:9(2255-2268)Online publication date: 1-May-2024
  • (2024)Efficient Approximation of Kemeny's Constant for Large GraphsProceedings of the ACM on Management of Data10.1145/36549372:3(1-26)Online publication date: 30-May-2024
  • (2024)DiffusionE: Reasoning on Knowledge Graphs via Diffusion-based Graph Neural NetworksProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3637528.3671997(222-230)Online publication date: 25-Aug-2024
  • Show More Cited By

Index Terms

  1. Personalized PageRank on Evolving Graphs with an Incremental Index-Update Scheme

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image Proceedings of the ACM on Management of Data
    Proceedings of the ACM on Management of Data  Volume 1, Issue 1
    PACMMOD
    May 2023
    2807 pages
    EISSN:2836-6573
    DOI:10.1145/3603164
    Issue’s Table of Contents
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 30 May 2023
    Published in PACMMOD Volume 1, Issue 1

    Permissions

    Request permissions for this article.

    Author Tags

    1. evolving or dynamic graphs
    2. personalized pagerank

    Qualifiers

    • Research-article

    Funding Sources

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)173
    • Downloads (Last 6 weeks)9
    Reflects downloads up to 02 Sep 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)BIRD: Efficient Approximation of Bidirectional Hidden Personalized PageRankProceedings of the VLDB Endowment10.14778/3665844.366585517:9(2255-2268)Online publication date: 1-May-2024
    • (2024)Efficient Approximation of Kemeny's Constant for Large GraphsProceedings of the ACM on Management of Data10.1145/36549372:3(1-26)Online publication date: 30-May-2024
    • (2024)DiffusionE: Reasoning on Knowledge Graphs via Diffusion-based Graph Neural NetworksProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3637528.3671997(222-230)Online publication date: 25-Aug-2024
    • (2024)Fast Computation of Kemeny's Constant for Directed GraphsProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3637528.3671859(3472-3483)Online publication date: 25-Aug-2024
    • (2024)Towards Deeper Understanding of PPR-based Embedding Approaches: A Topological PerspectiveProceedings of the ACM Web Conference 202410.1145/3589334.3645663(969-979)Online publication date: 13-May-2024
    • (2024)DPAR: Decoupled Graph Neural Networks with Node-Level Differential PrivacyProceedings of the ACM Web Conference 202410.1145/3589334.3645531(1170-1181)Online publication date: 13-May-2024
    • (2024)Efficient Algorithms for Personalized PageRank Computation: A SurveyIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2024.337600036:9(4582-4602)Online publication date: Sep-2024
    • (2024)Fast Personalized PageRank for Customized Analysis Range Using Static Index2024 Fifteenth International Conference on Ubiquitous and Future Networks (ICUFN)10.1109/ICUFN61752.2024.10625115(304-309)Online publication date: 2-Jul-2024
    • (2023)Efficient Dynamic Weighted Set Sampling and Its ExtensionProceedings of the VLDB Endowment10.14778/3617838.361784017:1(15-27)Online publication date: 1-Sep-2023
    • (2023)Scalable Approximate Butterfly and Bi-triangle Counting for Large Bipartite NetworksProceedings of the ACM on Management of Data10.1145/36267531:4(1-26)Online publication date: 12-Dec-2023
    • Show More Cited By

    View Options

    Get Access

    Login options

    Full Access

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media