Abstract
Recently, there has been a surge of research activity in the area of Link Analysis Ranking, where hyperlink structures are used to determine the relative authority of Web pages. One of the seminal works in this area is that of Kleinberg [15], who proposed the Hits algorithm. In this paper, we undertake a theoretical analysis of the properties of the Hits algorithm on a broad class of random graphs. Working within the framework of Borodin et al. [7], we prove that on this class (a) the Hits algorithm is stable with high probability, and (b) the Hits algorithm is similar to the InDegree heuristic, which assigns to each node a weight proportional to its number of incoming links. We demonstrate that our results hold when the expected in-degrees of the graph follow a power-law distribution, a situation observed in the actual Web graph [9]. We also study experimentally the similarity between Hits and InDegree, and we investigate the general conditions under which the two algorithms are similar.
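To make the comparison in the abstract concrete, the following is a minimal sketch, not the authors' experimental code. It computes Hits authority weights by power iteration and InDegree weights on a toy random graph, then measures how far apart the two weight vectors are. Several choices here are assumptions made for illustration only: the graph generator (`random_graph`, in which each link u -> v appears independently with probability `expected_in[v] / n`), the power-law-like expected in-degree sequence, the normalisation of weights to sum to 1, and the use of plain L1 distance as the similarity measure. None of these is taken from the paper's definitions.

```python
import random


def indegree_weights(edges, n):
    """InDegree heuristic: weight proportional to the number of incoming links.

    Weights are normalised to sum to 1 (an assumption made for comparability).
    """
    w = [0.0] * n
    for _, v in edges:
        w[v] += 1.0
    total = sum(w)
    return [x / total for x in w] if total else w


def hits_authority(edges, n, iterations=100):
    """Hits authority weights by power iteration on a list of (u, v) links.

    Each round sets authority(v) to the sum of hub weights pointing at v,
    then hub(u) to the sum of authority weights u points at.
    """
    hub = [1.0] * n
    auth = [0.0] * n
    for _ in range(iterations):
        new_auth = [0.0] * n
        for u, v in edges:
            new_auth[v] += hub[u]
        new_hub = [0.0] * n
        for u, v in edges:
            new_hub[u] += new_auth[v]
        a_sum, h_sum = sum(new_auth), sum(new_hub)
        if a_sum == 0.0 or h_sum == 0.0:
            return [0.0] * n  # graph has no links
        auth = [x / a_sum for x in new_auth]
        hub = [x / h_sum for x in new_hub]
    return auth


def random_graph(expected_in, rng):
    """Hypothetical generator: link u -> v is present independently with
    probability expected_in[v] / n, so node v has expected in-degree of
    roughly expected_in[v]. Self-links are excluded."""
    n = len(expected_in)
    edges = []
    for v, d in enumerate(expected_in):
        p = min(1.0, d / n)
        for u in range(n):
            if u != v and rng.random() < p:
                edges.append((u, v))
    return edges


def l1_distance(x, y):
    """Sum of absolute differences between two weight vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))


if __name__ == "__main__":
    n = 500
    # Illustrative power-law-like expected in-degrees (assumed, not from the paper).
    expected_in = [max(1.0, 50.0 / (i + 1) ** 0.5) for i in range(n)]
    edges = random_graph(expected_in, random.Random(0))
    hits = hits_authority(edges, n)
    indeg = indegree_weights(edges, n)
    print("links:", len(edges))
    print("L1 distance between Hits and InDegree weights:", l1_distance(hits, indeg))
```

A small distance means the two algorithms assign nearly the same weights on that graph. Varying the expected in-degree sequence is one way to explore when the two weight vectors drift apart.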
Partially supported by the EU under contract 001907 (DELIS) and 33555 (COSIN), and by the Italian MIUR under contract ALGO-NEXT.
References
Achlioptas, D., McSherry, F.: Fast computation of low rank matrix approximations. In: ACM Symposium on Theory of Computing (STOC) (2001)
Adamic, L.A., Huberman, B.A.: Zipf’s law and the internet. Glottometrics 3, 143–150 (2002)
Aiello, W., Chung, F.R.K., Lu, L.: Random evolution in massive graphs. In: IEEE Symposium on Foundations of Computer Science, pp. 510–519 (2001)
Azar, Y., Fiat, A., Karlin, A., McSherry, F., Saia, J.: Spectral analysis of data. In: Proceedings of the 33rd Symposium on Theory of Computing (STOC 2001), Greece (2001)
Barabasi, A.-L., Albert, R.: Emergence of scaling in random networks. Science 286, 509–512 (1999)
Bianchini, M., Gori, M., Scarselli, F.: Pagerank: A circuital analysis. In: Proceedings of the Eleventh International World Wide Web (WWW) Conference (2002)
Borodin, A., Roberts, G.O., Rosenthal, J.S., Tsaparas, P.: Link Analysis Ranking: Algorithms, Theory, and Experiments. ACM Transactions on Internet Technology 5(1) (2005)
Brin, S., Page, L.: The anatomy of a large-scale hypertextual Web search engine. In: Proceedings of the 7th International World Wide Web Conference, Brisbane, Australia (1998)
Broder, A., Kumar, R., Maghoul, F., Raghavan, P., Rajagopalan, S., Stata, R., Tomkins, A., Wiener, J.: Graph structure in the Web. In: Proceedings of WWW9 (2000)
Chung, F., Lu, L.: Connected components in random graphs with given degree sequences. Annals of Combinatorics 6, 125–145 (2002)
Chung, F., Lu, L.: The average distances in random graphs with given expected degrees. Internet Mathematics 1, 91–114 (2003)
Chung, F., Lu, L., Vu, V.: Eigenvalues of random power law graphs. Annals of Combinatorics 7, 21–33 (2003)
Erdős, P., Rényi, A.: On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci. 5, 17–61 (1960)
Fagin, R., Kumar, R., Sivakumar, D.: Comparing top k lists. In: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, SODA (2003)
Kleinberg, J.: Authoritative sources in a hyperlinked environment. In: Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 668–677 (1998)
Kumar, R., Raghavan, P., Rajagopalan, S., Sivakumar, D., Tomkins, A., Upfal, E.: Stochastic models for the web graph. In: Proceedings of the 41st Annual Symposium on Foundations of Computer Science (2000)
Lee, H.C., Borodin, A.: Perturbation of the hyperlinked environment. In: Proceedings of the Ninth International Computing and Combinatorics Conference (2003)
Lempel, R., Moran, S.: The stochastic approach for link-structure analysis (SALSA) and the TKC effect. In: Proceedings of the 9th International World Wide Web Conference (2000)
Lempel, R., Moran, S.: Rank stability and rank similarity of link-based web ranking algorithms in authority connected graphs. In: Second Workshop on Algorithms and Models for the Web-Graph, WAW 2003 (2003)
Mihail, M., Papadimitriou, C.H.: On the eigenvalue power law. In: Rolim, J.D.P., Vadhan, S.P. (eds.) RANDOM 2002. LNCS, vol. 2483, p. 254. Springer, Heidelberg (2002)
Motwani, R., Raghavan, P.: Randomized Algorithms. Cambridge University Press, Cambridge (1995)
Ng, A.Y., Zheng, A.X., Jordan, M.I.: Link analysis, eigenvectors, and stability. In: Proceedings of the International Joint Conference on Artificial Intelligence, IJCAI (2001)
Stewart, G.W., Sun, J.: Matrix Perturbation Theory. Academic Press, London (1990)
Zipf, G.K.: Human Behavior and the Principle of Least Effort. Addison-Wesley, Reading (1949)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Donato, D., Leonardi, S., Tsaparas, P. (2005). Stability and Similarity of Link Analysis Ranking Algorithms. In: Caires, L., Italiano, G.F., Monteiro, L., Palamidessi, C., Yung, M. (eds) Automata, Languages and Programming. ICALP 2005. Lecture Notes in Computer Science, vol 3580. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11523468_58
DOI: https://doi.org/10.1007/11523468_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27580-0
Online ISBN: 978-3-540-31691-6
eBook Packages: Computer Science, Computer Science (R0)