Efficient and Provable Effective Resistance Computation on Large Graphs: An Index-based Approach

Published: 30 May 2024

Abstract

Effective resistance (ER) is a fundamental metric for measuring node similarity in a graph, with applications in graph clustering, recommendation systems, link prediction, and graph neural networks. The state-of-the-art algorithm for computing effective resistance relies on a landmark technique, which selects as the landmark a node that is easy to reach from all other nodes. The performance of this technique depends heavily on the chosen landmark node. However, in many real-life graphs it is not always possible to find an easily reachable landmark node, which can significantly hinder the algorithm's efficiency. To overcome this problem, we propose a novel multiple-landmark technique that selects a set of landmark nodes V_l such that every other node in the graph can easily reach at least one landmark node in V_l. Specifically, we first propose several new formulas for computing ER with multiple landmarks, based on the concept of the Schur complement. These formulas allow us to pre-compute and maintain several small matrices related to V_l as a compact index. With this index, we demonstrate that both single-pair and single-source ER queries can be answered efficiently using a newly developed V_l-absorbed random walk sampling technique or a V_l-absorbed push technique. Comprehensive theoretical analysis shows that all proposed index-based algorithms achieve provable performance guarantees for both single-pair and single-source ER queries. Extensive experiments on five real-life datasets demonstrate the high efficiency of our multiple-landmark index techniques. For instance, with a 1.5 GB index, our algorithms can be up to 4 orders of magnitude faster than the state-of-the-art algorithms while achieving the same accuracy on a large road network.
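
To make the landmark idea and the Schur complement concrete, the following is a minimal sketch on a toy graph; it is not the algorithm proposed in the paper. It assumes a small, connected, undirected, unweighted graph and uses exact dense linear algebra, whereas the paper relies on random walk sampling and push techniques over a pre-computed index. The function names (`laplacian`, `solve`, `er_grounded`, `schur_complement`) are hypothetical and chosen only for illustration. The sketch relies on two standard facts: ER can be obtained from the Laplacian with one landmark's row and column removed, and the Schur complement of the Laplacian onto a node set preserves the ER between nodes of that set.

```python
from fractions import Fraction


def laplacian(n, edges):
    """Dense Laplacian L = D - A of an undirected, unweighted graph."""
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    return L


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with exact fractions.

    Assumes A is square and nonsingular (true here for a connected graph).
    """
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def er_grounded(L, s, t, ground):
    """Effective resistance r(s, t), using `ground` as a single landmark.

    Removes the landmark's row and column from L, injects one unit of
    current at s and extracts it at t, and reads off the potential gap.
    """
    keep = [i for i in range(len(L)) if i != ground]
    Lg = [[L[i][j] for j in keep] for i in keep]
    rhs = [Fraction(int(i == s) - int(i == t)) for i in keep]
    x = dict(zip(keep, solve(Lg, rhs)))
    x[ground] = Fraction(0)  # the landmark is held at potential 0
    return x[s] - x[t]


def schur_complement(L, landmarks):
    """S = L_LL - L_LU * inv(L_UU) * L_UL, with U = all non-landmark nodes."""
    land = list(landmarks)
    rest = [i for i in range(len(L)) if i not in land]
    L_UU = [[L[i][j] for j in rest] for i in rest]
    S = [[L[a][b] for b in land] for a in land]
    for cb, b in enumerate(land):
        col = solve(L_UU, [L[u][b] for u in rest])  # column of inv(L_UU) L_UL
        for ca, a in enumerate(land):
            S[ca][cb] -= sum(L[a][u] * xu for u, xu in zip(rest, col))
    return S


# Toy example: a 4-cycle 0-1-2-3-0 with unit edge weights.
L = laplacian(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
print(er_grounded(L, 0, 1, ground=3))  # adjacent nodes
print(er_grounded(L, 0, 2, ground=1))  # opposite nodes

# Eliminate nodes 1 and 3; the reduced 2x2 matrix keeps r(0, 2) unchanged.
S = schur_complement(L, [0, 2])
print(er_grounded(S, 0, 1, ground=1))  # indices refer to positions in [0, 2]
```

The result does not depend on which node is grounded, only the cost of computing it does, which is why landmark choice matters for efficiency rather than correctness. The dense elimination used here scales cubically and is meant only to illustrate the identities, not to suggest a practical method for large graphs.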

      Published In

      Proceedings of the ACM on Management of Data  Volume 2, Issue 3
      SIGMOD
      June 2024
      1953 pages
      EISSN:2836-6573
      DOI:10.1145/3670010
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 30 May 2024
      Published in PACMMOD Volume 2, Issue 3

      Author Tags

      1. approximate algorithm
      2. effective resistance
      3. graph proximity

      Qualifiers

      • Research-article
