DOI: 10.1145/1772690.1772730
research-article

Tracking the random surfer: empirically measured teleportation parameters in PageRank

Published: 26 April 2010
  • Abstract

    PageRank computes the importance of each node in a directed graph under a random surfer model governed by a teleportation parameter. Commonly denoted alpha, this parameter models the probability of following an edge inside the graph or, when the graph comes from a network of web pages and links, of clicking a link on a web page. We empirically measure the teleportation parameter based on browser toolbar logs and a click trail analysis. For a particular user or machine, such analysis produces a value of alpha. We find that these values are well fit by a Beta distribution with mean edge-following probability between 0.3 and 0.7, depending on the site. Using these distributions, we compute PageRank scores in which the teleportation parameter is a distribution rather than a constant. These new metrics are evaluated on the graph of pages in Wikipedia.
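
    The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes uniform teleportation, a dense adjacency matrix, a method-of-moments Beta fit (the abstract does not say how the fit was done), and a Monte Carlo average over sampled alpha values as one simple reading of "PageRank with respect to a distribution". The function names pagerank, fit_beta_moments, and expected_pagerank, and the toy graph, are hypothetical.

```python
import numpy as np


def pagerank(adj, alpha, tol=1e-12, max_iter=1000):
    """PageRank by power iteration with uniform teleportation.

    adj[i, j] is nonzero if node i links to node j; alpha is the
    probability of following a link (assumed convention, as in the abstract).
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)
    dangling = out_deg == 0
    # Row-stochastic transition matrix; dangling rows stay zero here
    # and their mass is redistributed uniformly inside the loop.
    P = np.zeros_like(adj)
    P[~dangling] = adj[~dangling] / out_deg[~dangling, None]
    v = np.full(n, 1.0 / n)
    x = v.copy()
    for _ in range(max_iter):
        x_new = alpha * (x @ P + x[dangling].sum() * v) + (1.0 - alpha) * v
        if np.abs(x_new - x).sum() < tol:
            return x_new
        x = x_new
    return x


def fit_beta_moments(alphas):
    """Method-of-moments Beta fit to per-user alpha values (assumed estimator)."""
    m = np.mean(alphas)
    var = np.var(alphas)
    common = m * (1.0 - m) / var - 1.0  # requires var < m * (1 - m)
    return m * common, (1.0 - m) * common


def expected_pagerank(adj, a, b, n_samples=500, seed=0):
    """Average PageRank over alpha ~ Beta(a, b) by Monte Carlo sampling."""
    rng = np.random.default_rng(seed)
    alphas = rng.beta(a, b, size=n_samples)
    return np.mean([pagerank(adj, al) for al in alphas], axis=0)


# Hypothetical toy graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
toy_adj = np.array([[0, 1, 1],
                    [0, 0, 1],
                    [1, 0, 0]])
# Hypothetical per-user alpha measurements, for illustration only.
a_hat, b_hat = fit_beta_moments([0.25, 0.75])
print(expected_pagerank(toy_adj, a_hat, b_hat))
```

    Because each sampled PageRank vector sums to one, so does their average. A quadrature rule over the Beta density could replace the Monte Carlo loop when a deterministic result is wanted.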


    Cited By

    • (2020) From Appearance to Essence. ACM Transactions on Intelligent Systems and Technology 11:6 (1-24). DOI: 10.1145/3411749. Online publication date: 11-Sep-2020
    • (2020) CycleRank, or there and back again: personalized relevance scores from cyclic paths on directed graphs. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 476:2241. DOI: 10.1098/rspa.2019.0740. Online publication date: 9-Sep-2020
    • (2019) SmartVote. World Wide Web 22:4 (1855-1885). DOI: 10.1007/s11280-018-0629-3. Online publication date: 1-Jul-2019

      Published In

      ACM Other conferences
      WWW '10: Proceedings of the 19th international conference on World wide web
      April 2010
      1407 pages
      ISBN: 9781605587998
      DOI: 10.1145/1772690

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. PageRank
      2. Wikipedia
      3. click trail analysis
      4. empirical click probability
      5. teleportation parameter
      6. toolbar data

      Conference

      WWW '10
      WWW '10: The 19th International World Wide Web Conference
      April 26 - 30, 2010
      Raleigh, North Carolina, USA

      Acceptance Rates

      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

      Citations

      Cited By

      • (2018) Auditing the Personalization and Composition of Politically-Related Search Engine Results Pages. Proceedings of the 2018 World Wide Web Conference (955-965). DOI: 10.1145/3178876.3186143. Online publication date: 10-Apr-2018
      • (2017) Suppressing the Search Engine Manipulation Effect (SEME). Proceedings of the ACM on Human-Computer Interaction 1:CSCW (1-22). DOI: 10.1145/3134677. Online publication date: 6-Dec-2017
      • (2017) Web service ranking and visualization based on QoS properties and invocation relationships. 2017 2nd International Conference on Image, Vision and Computing (ICIVC) (734-738). DOI: 10.1109/ICIVC.2017.7984653. Online publication date: Jun-2017
      • (2017) PageRank-based analysis and visualization of ethnic entrepreneurship and innovation. 2017 2nd International Conference on Image, Vision and Computing (ICIVC) (724-728). DOI: 10.1109/ICIVC.2017.7984651. Online publication date: Jun-2017
      • (2017) GrandBase: generating actionable knowledge from Big Data. PSU Research Review 1:2 (105-126). DOI: 10.1108/PRR-01-2017-0005. Online publication date: 14-Aug-2017
      • (2017) SourceVote: Fusing Multi-valued Data via Inter-source Agreements. Conceptual Modeling (164-172). DOI: 10.1007/978-3-319-69904-2_13. Online publication date: 21-Oct-2017
      • (2016) Assessing the Navigational Effects of Click Biases and Link Insertion on the Web. Proceedings of the 27th ACM Conference on Hypertext and Social Media (37-47). DOI: 10.1145/2914586.2914594. Online publication date: 10-Jul-2016
