Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/3448016.3457298acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Unifying the Global and Local Approaches: An Efficient Power Iteration with Forward Push

Published: 18 June 2021 Publication History

Abstract

Personalized PageRank (PPR) is a critical measure of the importance of a node t to a source node s in a graph. The Single-Source PPR (SSPPR) query computes the PPR's of all the nodes with respect to s on a directed graph G with n nodes and m edges; and it is an essential operation widely used in graph applications. In this paper, we propose novel algorithms for answering two variants of SSPPR queries: (i) high-precision queries and (ii) approximate queries.
For high-precision queries, Power Iteration (PowItr) and Forward Push (FwdPush) are two fundamental approaches. Given an absolute error threshold λ (which is typically set to as small as 10-8), the only known bound of FwdPush is O(m/λ), much worse than the O(m log 1/λ)-bound of PowItr. Whether FwdPush can achieve the same running time bound as PowItr does still remains an open question in the research community. We give a positive answer to this question. We show that the running time of a common implementation of FwdPush is actually bounded by O(m · log 1/λ). Based on this finding, we propose a new algorithm, called Power Iteration with Forward Push (PowerPush), which incorporates the strengths of both PowItr and FwdPush.
For approximate queries (with a relative error ε), we propose a new algorithm, called SpeedPPR, with overall expected time bounded by $O(n · log n · log 1/ε) on scale-free graphs. This improves the state-of-the-art O((n · log n)/ε) bound.
We conduct extensive experiments on six real datasets. The experimental results show that PowerPush outperforms the state-of-the-art high-precision algorithm BePi by up to an order of magnitude in both efficiency and accuracy. Furthermore, our SpeedPPR also outperforms the state-of-the-art approximate algorithm FORA by up to an order of magnitude in all aspects including query time, accuracy, pre-processing time as well as index size.

Supplementary Material

MP4 File (3448016.3457298.mp4)
Personalized PageRank (PPR) is a critical measure on the importance of a node $t$ to a source node $s$ in a graph. A Single-Source PPR (SSPPR) query computes the PPR's of all the nodes with respect to $s$ on a directed graph $G$ with $n$ nodes and $m$ edges, and it is an essential operation widely used in graph applications. In this paper, we propose novel algorithms for solving two variants of SSPPR: (i) high-precision queries and (ii) approximate queries. For the high-precision queries, Power Iteration (PowItr) and Forward Push (FwdPush) are two fundamental approaches. Given an absolute error threshold $\lamda$, the only known bound of FwdPush is $O(\frac{m}{\lamda})$, much worse than the $O(m \log \frac{1}{\lamda})$-bound of PowItr. Whether FwdPush can achieve the same running time bound as PowItr does still remains an open question in the research community. We give a positive answer to this question by showing that the running time of a common implementation of FwdPush is actually bounded by $O(m \cdot \log \frac{1}{\lamda})$. Based on this finding, we propose a new algorithm, called Power Iteration with Forward Push (PowForPush), which incorporates both strengths of PowItr and FwdPush. For approximate queries (with a relative error $\eps$), we propose a new algorithm, called SpeedPPR, with overall expected time bounded by $O(n \cdot \log n \cdot \log \frac{1}{\eps})$ on scale-free graphs. This bound greatly improves the $O(\frac{n \cdot \log n}{\eps})$ bound of a state-of-the-art algorithm FORA. We conduct extensive experiments on six real datasets. The experimental results show that PowForPush outperforms the state-of-the-art algorithm BePi by up to an order of magnitude in both efficiency and accuracy. Furthermore, our SpeedPPR also outperforms FORA by up to an order of magnitude in all aspects includes query time, accuracy, pre-processing time as well as index size.

References

[1]
Reid Andersen, Christian Borgs, Jennifer T. Chayes, John E. Hopcroft, Vahab S. Mirrokni, and Shang-Hua Teng. 2007. Local Computation of PageRank Contributions. In WAW. 150--165.
[2]
Reid Andersen, Fan R. K. Chung, and Kevin J. Lang. 2006. Local Graph Partitioning using PageRank Vectors. In FOCS. 475--486.
[3]
Lars Backstrom, Daniel P. Huttenlocher, Jon M. Kleinberg, and Xiangyang Lan. 2006. Group formation in large social networks: membership, growth, and evolution. In Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Philadelphia, PA, USA, August 20--23, 2006, Tina Eliassi-Rad, Lyle H. Ungar, Mark Craven, and Dimitrios Gunopulos (Eds.). ACM, 44--54.
[4]
Lars Backstrom and Jure Leskovec. 2011. Supervised random walks: predicting and recommending links in social networks. In WSDM. 635--644.
[5]
Bahman Bahmani, Kaushik Chakrabarti, and Dong Xin. 2011. Fast personalized PageRank on MapReduce. In SIGMOD. 973--984.
[6]
Bahman Bahmani, Abdur Chowdhury, and Ashish Goel. 2010. Fast incremental and personalized pagerank. VLDB, Vol. 4, 3 (2010), 173--184.
[7]
Pavel Berkhin. 2005. Survey: A Survey on PageRank Computing. Internet Math., Vol. 2, 1 (2005), 73--120.
[8]
Soumen Chakrabarti. 2007. Dynamic personalized pagerank in entity-relation graphs. In WWW. 571--580.
[9]
Fan Chung and Linyuan Lu. 2006. Concentration inequalities and martingale inequalities: a survey. Internet Math., Vol. 3, 1 (2006), 79--127.
[10]
Mustafa Coskun, Ananth Grama, and Mehmet Koyuturk. 2016. Efficient processing of network proximity queries via chebyshev acceleration. In KDD. 1515--1524.
[11]
Dániel Fogaras, Balázs Rácz, Károly Csalogány, and Tamás Sarlós. 2005. Towards scaling fully personalized pagerank: Algorithms, lower bounds, and experiments. Internet Mathematics, Vol. 2, 3 (2005), 333--358.
[12]
Yasuhiro Fujiwara, Makoto Nakatsuji, Makoto Onizuka, and Masaru Kitsuregawa. 2012a. Fast and Exact Top-k Search for Random Walk with Restart. PVLDB, Vol. 5, 5 (2012), 442--453.
[13]
Yasuhiro Fujiwara, Makoto Nakatsuji, Hiroaki Shiokawa, Takeshi Mishima, and Makoto Onizuka. 2013a. Efficient ad-hoc search for personalized PageRank. In SIGMOD. 445--456.
[14]
Yasuhiro Fujiwara, Makoto Nakatsuji, Hiroaki Shiokawa, Takeshi Mishima, and Makoto Onizuka. 2013b. Fast and Exact Top-k Algorithm for PageRank. In AAAI.
[15]
Yasuhiro Fujiwara, Makoto Nakatsuji, Takeshi Yamamuro, Hiroaki Shiokawa, and Makoto Onizuka. 2012b. Efficient personalized pagerank with accuracy assurance. In KDD. 15--23.
[16]
Tao Guo, Xin Cao, Gao Cong, Jiaheng Lu, and Xuemin Lin. 2017. Distributed Algorithms on Exact Personalized PageRank. In SIGMOD. 479--494.
[17]
Manish S. Gupta, Amit Pathak, and Soumen Chakrabarti. 2008. Fast algorithms for top-k personalized pagerank queries. In WWW. 1225--1226.
[18]
Glen Jeh and Jennifer Widom. 2003. Scaling personalized web search. In WWW. 271--279.
[19]
Jinhong Jung, Namyong Park, Sael Lee, and U Kang. 2017. BePI: Fast and Memory-Efficient Method for Billion-Scale Random Walk with Restart. In SIGMOD. 789--804.
[20]
Haewoon Kwak, Changhyun Lee, Hosung Park, and Sue B. Moon. 2010. What is Twitter, a social network or a news media?. In Proceedings of the 19th International Conference on World Wide Web, WWW 2010, Raleigh, North Carolina, USA, April 26--30, 2010, Michael Rappa, Paul Jones, Juliana Freire, and Soumen Chakrabarti (Eds.). ACM, 591--600.
[21]
Jure Leskovec, Kevin J. Lang, Anirban Dasgupta, and Michael W. Mahoney. 2009. Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters. Internet Math., Vol. 6, 1 (2009), 29--123.
[22]
Dandan Lin, Raymond Chi-Wing Wong, Min Xie, and Victor Junqiu Wei. 2020. Index-Free Approach with Theoretical Guarantee for Efficient Random Walk with Restart Query. In 2020 IEEE 36th International Conference on Data Engineering (ICDE). IEEE, 913--924.
[23]
Peter Lofgren, Siddhartha Banerjee, and Ashish Goel. 2015. Bidirectional PageRank Estimation: From Average-Case to Worst-Case. In WAW. 164--176.
[24]
Peter Lofgren, Siddhartha Banerjee, and Ashish Goel. 2016. Personalized pagerank estimation and search: A bidirectional approach. In WSDM. 163--172.
[25]
Peter A Lofgren, Siddhartha Banerjee, Ashish Goel, and C Seshadhri. 2014. Fast-ppr: Scaling personalized pagerank estimation for large graphs. In KDD. 1436--1445.
[26]
Takanori Maehara, Takuya Akiba, Yoichi Iwata, and Ken-ichi Kawarabayashi. 2014. Computing personalized PageRank quickly by exploiting graph structures. PVLDB, Vol. 7, 12 (2014), 1023--1034.
[27]
Naoto Ohsaka, Takanori Maehara, and Ken-ichi Kawarabayashi. 2015. Efficient PageRank Tracking in Evolving Networks. In KDD. 875--884.
[28]
Mingdong Ou, Peng Cui, Jian Pei, Ziwei Zhang, and Wenwu Zhu. 2016. Asymmetric Transitivity Preserving Graph Embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, August 13--17, 2016, Balaji Krishnapuram, Mohak Shah, Alexander J. Smola, Charu C. Aggarwal, Dou Shen, and Rajeev Rastogi (Eds.). ACM, 1105--1114.
[29]
Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. 1999. The PageRank citation ranking: bringing order to the web. (1999).
[30]
CH Ren, Luyi Mo, CM Kao, CK Cheng, and DWL Cheung. 2014. CLUDE: An Efficient Algorithm for LU Decomposition Over a Sequence of Evolving Graphs. In EDBT .
[31]
Atish Das Sarma, Anisur Rahaman Molla, Gopal Pandurangan, and Eli Upfal. 2015. Fast distributed pagerank computation. Theoretical Computer Science, Vol. 561 (2015), 113--121.
[32]
Kijung Shin, Jinhong Jung, Lee Sael, and U. Kang. 2015. BEAR: Block Elimination Approach for Random Walk with Restart on Large Graphs. In SIGMOD. 1571--1585.
[33]
L. Takac and Michal Zábovský. 2012. Data analysis in public social networks. International Scientific Conference and International Workshop Present Day Trends of Innovations (01 2012), 1--6.
[34]
Anton Tsitsulin, Davide Mottin, Panagiotis Karras, and Emmanuel Mü ller. 2018. VERSE: Versatile Graph Embeddings from Similarity Measures. In Proceedings of the 2018 World Wide Web Conference on World Wide Web, WWW 2018, Lyon, France, April 23--27, 2018, Pierre-Antoine Champin, Fabien L. Gandon, Mounia Lalmas, and Panagiotis G. Ipeirotis (Eds.). ACM, 539--548.
[35]
Sibo Wang, Youze Tang, Xiaokui Xiao, Yin Yang, and Zengxiang Li. 2016. HubPPR: Effective Indexing for Approximate Personalized PageRank. PVLDB, Vol. 10, 3 (2016), 205--216.
[36]
Sibo Wang, Renchi Yang, Runhui Wang, Xiaokui Xiao, Zhewei Wei, Wenqing Lin, Yin Yang, and Nan Tang. 2019. Efficient Algorithms for Approximate Single-Source Personalized PageRank Queries. ACM Trans. Database Syst., Vol. 44, 4 (2019), 18:1--18:37.
[37]
Sibo Wang, Renchi Yang, Xiaokui Xiao, Zhewei Wei, and Yin Yang. 2017. FORA: Simple and Effective Approximate Single-Source Personalized PageRank. In KDD. 505--514.
[38]
Zhewei Wei, Xiaodong He, Xiaokui Xiao, Sibo Wang, Shuo Shang, and Ji-Rong Wen. 2018. Topppr: top-k personalized pagerank queries with precision guarantees on large graphs. In Proceedings of the 2018 International Conference on Management of Data. 441--456.
[39]
Yubao Wu, Ruoming Jin, and Xiang Zhang. 2014. Fast and unified local search for random walk based k-nearest-neighbor query in large graphs. In SIGMOD 2014. 1139--1150.
[40]
Jaewon Yang and Jure Leskovec. 2012. Defining and Evaluating Network Communities Based on Ground-Truth. In 12th IEEE International Conference on Data Mining, ICDM 2012, Brussels, Belgium, December 10--13, 2012, Mohammed Javeed Zaki, Arno Siebes, Jeffrey Xu Yu, Bart Goethals, Geoffrey I. Webb, and Xindong Wu (Eds.). IEEE Computer Society, 745--754.
[41]
Yuan Yin and Zhewei Wei. 2019. Scalable Graph Embeddings via Sparse Transpose Proximities. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019, Anchorage, AK, USA, August 4--8, 2019, Ankur Teredesai, Vipin Kumar, Ying Li, Ró mer Rosales, Evimaria Terzi, and George Karypis (Eds.). ACM, 1429--1437.
[42]
Weiren Yu and Xuemin Lin. 2013. IRWR: incremental random walk with restart. In SIGIR. 1017--1020.
[43]
Weiren Yu and Julie A. McCann. 2016. Random Walk with Restart over Dynamic Graphs. In ICDM. 589--598.
[44]
Hongyang Zhang, Peter Lofgren, and Ashish Goel. 2016. Approximate Personalized PageRank on Dynamic Graphs. In KDD. 1315--1324.
[45]
Kai Zhang, Yaokang Zhu, Jun Wang, and Jie Zhang. 2020. Adaptive Structural Fingerprints for Graph Attention Networks. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26--30, 2020. OpenReview.net.
[46]
Fanwei Zhu, Yuan Fang, Kevin Chen-Chuan Chang, and Jing Ying. 2013. Incremental and Accuracy-Aware Personalized PageRank through Scheduled Approximation. PVLDB, Vol. 6, 6 (2013), 481--492.

Cited By

View all

Index Terms

  1. Unifying the Global and Local Approaches: An Efficient Power Iteration with Forward Push

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data
    June 2021
    2969 pages
    ISBN:9781450383431
    DOI:10.1145/3448016
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 18 June 2021

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. forward push
    2. personalized pagerank
    3. power iteration

    Qualifiers

    • Research-article

    Funding Sources

    Conference

    SIGMOD/PODS '21
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)65
    • Downloads (Last 6 weeks)6
    Reflects downloads up to 04 Oct 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)BIRD: Efficient Approximation of Bidirectional Hidden Personalized PageRankProceedings of the VLDB Endowment10.14778/3665844.366585517:9(2255-2268)Online publication date: 1-May-2024
    • (2024)QTCS: Efficient Query-Centered Temporal Community SearchProceedings of the VLDB Endowment10.14778/3648160.364816317:6(1187-1199)Online publication date: 3-May-2024
    • (2024)Efficient Approximation of Kemeny's Constant for Large GraphsProceedings of the ACM on Management of Data10.1145/36549372:3(1-26)Online publication date: 30-May-2024
    • (2024)Efficient and Provable Effective Resistance Computation on Large Graphs: An Index-based ApproachProceedings of the ACM on Management of Data10.1145/36549362:3(1-27)Online publication date: 30-May-2024
    • (2024)Fast Computation of Kemeny's Constant for Directed GraphsProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3637528.3671859(3472-3483)Online publication date: 25-Aug-2024
    • (2024)Efficient Algorithms for Personalized PageRank Computation: A SurveyIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2024.337600036:9(4582-4602)Online publication date: Sep-2024
    • (2024)Efficient Algorithms for Group Hitting Probability Queries on Large GraphsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2023.334916436:7(2995-3008)Online publication date: Jul-2024
    • (2024)Fast Personalized PageRank for Customized Analysis Range Using Static Index2024 Fifteenth International Conference on Ubiquitous and Future Networks (ICUFN)10.1109/ICUFN61752.2024.10625115(304-309)Online publication date: 2-Jul-2024
    • (2024)Personalized PageRanks over Dynamic Graphs - The Case for Optimizing Quality of Service2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00038(409-422)Online publication date: 13-May-2024
    • (2024)FICOM: an effective and scalable active learning framework for GNNs on semi-supervised node classificationThe VLDB Journal10.1007/s00778-024-00870-z33:5(1723-1742)Online publication date: 22-Jul-2024
    • Show More Cited By

    View Options

    Get Access

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media