Nearly linear time algorithm for mean hitting times of random walks on a graph

Z Zhang, W Xu, Z Zhang - … of the 13th International Conference on Web …, 2020 - dl.acm.org
Proceedings of the 13th International Conference on Web Search and Data Mining, 2020dl.acm.org
For random walks on a graph, the mean hitting time H_j from a vertex i chosen from the
stationary distribution to the target vertex j can be used as a measure of importance for
vertex j, while the Kemeny constant K is the mean hitting time from a vertex i to a vertex j
selected randomly according to the stationary distribution. Both quantities have found a
large variety of applications in different areas. However, their high computational complexity
limits their applications, especially for large networks with millions of vertices. In this paper …
For random walks on a graph, the mean hitting time from a vertex i chosen from the stationary distribution to the target vertex j can be used as a measure of importance for vertex j, while the Kemeny constant K is the mean hitting time from a vertex i to a vertex j selected randomly according to the stationary distribution. Both quantities have found a large variety of applications in different areas. However, their high computational complexity limits their applications, especially for large networks with millions of vertices. In this paper, we first establish a connection between the two quantities, representing K in terms of for all vertices. We then express both quantities in terms of quadratic forms of the pseudoinverse for graph Laplacian, based on which we develop an efficient algorithm that provides an approximation of for all vertices and K in nearly linear time with respect to the edge number, with high probability. Extensive experiment results on real-life and model networks validate both the efficiency and accuracy of the proposed algorithm.
ACM Digital Library