research-article

Efficient Resistance Distance Computation: The Power of Landmark-based Approaches

Published: 30 May 2023

Abstract

Resistance distance is a fundamental metric for measuring the similarity between two nodes in a graph, and it is widely used in real-world applications. In this paper, we study two problems on approximately computing resistance distance: (i) the single-pair query, which computes the resistance distance r(s, t) for a given pair of nodes (s, t); and (ii) the single-source query, which computes the resistance distances r(s, u) from a given source node s to all nodes u in the graph. Existing algorithms for these two query problems are often costly on large graphs. To solve these problems efficiently, we first establish several connections among resistance distance, a new concept called the v-absorbed random walk, random spanning forests, and a newly developed v-absorbed push procedure. Based on these connections, we propose three novel and efficient sampling-based algorithms as well as a deterministic algorithm for the single-pair query, and we develop one online and two index-based approximation algorithms for the single-source query. We show that, with the aid of a linear-size index, the two index-based algorithms for the single-source query take almost the same running time as the algorithms for the single-pair query. The striking feature of all our algorithms is that they are free to select a landmark node v that random walks on the graph hit easily. Such an easy-to-hit landmark node v makes the v-absorbed random walk sampling, the spanning tree sampling, and the v-absorbed push more efficient, thus significantly improving the performance of our algorithms. Extensive experiments on 5 real-life datasets show that our algorithms substantially outperform the state-of-the-art algorithms for the two resistance distance query problems in terms of both running time and estimation error.
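
The role of the landmark node v can be illustrated on a toy graph. The sketch below is not the paper's algorithm. It assumes that a v-absorbed random walk is a simple random walk that stops when it first reaches v, and it relies on the standard identity r(s, t) = M[s][s] + M[t][t] - 2 M[s][t], where M is the inverse of the graph Laplacian with the row and column of v removed (entries involving v are taken as zero). M[s][u] also equals the expected number of visits to u by a walk from s absorbed at v, divided by the degree of u, which suggests a naive Monte Carlo estimator. All function names and the example graph are hypothetical.

```python
import random
from fractions import Fraction


def grounded_inverse(adj, v):
    """Invert the Laplacian with the landmark's row and column removed.

    Returns (idx, inv): idx maps each non-landmark node to a matrix index,
    and inv is the exact inverse as a list of rows of Fractions.
    """
    nodes = [u for u in sorted(adj) if u != v]
    idx = {u: i for i, u in enumerate(nodes)}
    n = len(nodes)
    # Augmented matrix [L_v | I], kept in exact rational arithmetic.
    m = [[Fraction(0)] * (2 * n) for _ in range(n)]
    for u in nodes:
        i = idx[u]
        m[i][i] = Fraction(len(adj[u]))  # degree in the full graph
        m[i][n + i] = Fraction(1)
        for w in adj[u]:
            if w != v:
                m[i][idx[w]] -= 1
    # Gauss-Jordan elimination; L_v is nonsingular on a connected graph.
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return idx, [row[n:] for row in m]


def resistance_exact(adj, s, t, v):
    """Exact r(s, t) from the grounded Laplacian inverse at landmark v."""
    idx, inv = grounded_inverse(adj, v)

    def entry(a, b):
        # Entries in the landmark's row or column are zero by convention.
        if a == v or b == v:
            return Fraction(0)
        return inv[idx[a]][idx[b]]

    return entry(s, s) + entry(t, t) - 2 * entry(s, t)


def visits_before_absorption(adj, start, v, rng):
    """Count visits to each node on one walk from `start` that stops at v."""
    counts = {}
    u = start
    while u != v:
        counts[u] = counts.get(u, 0) + 1
        u = rng.choice(adj[u])
    return counts


def resistance_sampled(adj, s, t, v, walks=20000, seed=0):
    """Naive Monte Carlo estimate of r(s, t) from v-absorbed walks."""
    rng = random.Random(seed)
    ds, dt = len(adj[s]), len(adj[t])
    total = 0.0
    for _ in range(walks):
        from_s = visits_before_absorption(adj, s, v, rng)
        from_t = visits_before_absorption(adj, t, v, rng)
        total += (from_s.get(s, 0) / ds + from_t.get(t, 0) / dt
                  - from_s.get(t, 0) / dt - from_t.get(s, 0) / ds)
    return total / walks


# Hypothetical example: a 4-cycle 0-1-2-3-0 with neighbor lists.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
exact = resistance_exact(cycle, 0, 2, v=1)
estimate = resistance_sampled(cycle, 0, 2, v=1)
```

On the 4-cycle, every choice of landmark yields the same exact value r(0, 2) = 1 (two parallel paths of resistance 2). The landmark changes only how long the absorbed walks run before stopping, which is why an easy-to-hit landmark is attractive.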

Supplemental Material

MP4 File
A video presented for the SIGMOD 2023 paper "Efficient Resistance Distance Computation: the Power of Landmark-based Approaches".

Published In

Proceedings of the ACM on Management of Data (PACMMOD), Volume 1, Issue 1
May 2023
2807 pages
EISSN: 2836-6573
DOI: 10.1145/3603164

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. approximate algorithm
2. graph proximity
3. resistance distance

Cited By

• (2024) BIRD: Efficient Approximation of Bidirectional Hidden Personalized PageRank. Proceedings of the VLDB Endowment 17:9 (2255-2268). DOI: 10.14778/3665844.3665855. Online publication date: 6-Aug-2024
• (2024) Efficient Approximation of Kemeny's Constant for Large Graphs. Proceedings of the ACM on Management of Data 2:3 (1-26). DOI: 10.1145/3654937. Online publication date: 30-May-2024
• (2024) Efficient Computation for Diagonal of Forest Matrix via Variance-Reduced Forest Sampling. Proceedings of the ACM Web Conference 2024 (792-802). DOI: 10.1145/3589334.3645578. Online publication date: 13-May-2024
• (2024) Resistance Eccentricity in Graphs: Distribution, Computation and Optimization. 2024 IEEE 40th International Conference on Data Engineering (ICDE) (4113-4126). DOI: 10.1109/ICDE60146.2024.00315. Online publication date: 13-May-2024
