DOI: 10.1007/11549970_12
Article

Hypergraph partitioning for faster parallel PageRank computation

Published: 01 September 2005

Abstract

The PageRank algorithm is used by search engines such as Google to order web pages. It uses an iterative numerical method to compute the maximal eigenvector of a transition matrix derived from the web's hyperlink structure and a user-centred model of web-surfing behaviour. As the web has expanded and as demand for user-tailored web page ordering metrics has grown, scalable parallel computation of PageRank has become a focus of considerable research effort.
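The iterative method mentioned above is commonly realised as a power iteration. The following is a minimal serial sketch, not the authors' implementation: the damping factor of 0.85, the uniform teleportation vector, the uniform treatment of pages without out-links, and the four-page example graph are all assumptions made for illustration.

```python
import numpy as np

def pagerank(links, n, damping=0.85, tol=1e-10, max_iter=1000):
    """Power iteration over a link list [(src, dst), ...] on n pages."""
    out_degree = np.zeros(n)
    for src, _ in links:
        out_degree[src] += 1

    x = np.full(n, 1.0 / n)  # start from the uniform distribution
    for _ in range(max_iter):
        new_x = np.zeros(n)
        # Each page shares its current rank equally among its out-links.
        for src, dst in links:
            new_x[dst] += x[src] / out_degree[src]
        # Rank held by pages without out-links is spread uniformly (assumption).
        dangling = x[out_degree == 0].sum()
        new_x = damping * (new_x + dangling / n) + (1.0 - damping) / n
        # Stop once the L1 change between iterates falls below tol.
        if np.abs(new_x - x).sum() < tol:
            return new_x
        x = new_x
    return x

# Hypothetical four-page web: page 2 receives the most in-links.
links = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]
ranks = pagerank(links, n=4)
```

Each pass over `links` is in effect one sparse matrix-vector product, which is the operation a parallel implementation has to distribute across processors.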
In this paper, we seek a scalable problem decomposition for parallel PageRank computation, through the use of state-of-the-art hypergraph-based partitioning schemes. These have not been previously applied in this context. We consider both one- and two-dimensional hypergraph decomposition models. Exploiting the recent availability of the Parkway 2.1 parallel hypergraph partitioner, we present empirical results on a gigabit PC cluster for three publicly available web graphs. Our results show that hypergraph-based partitioning substantially reduces communication volume over conventional partitioning schemes (by up to three orders of magnitude), while still maintaining computational load balance. They also show a halving of the per-iteration runtime cost when compared to the most effective alternative approach used to date.
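To make the notion of communication volume concrete, the sketch below counts the vector entries that must be exchanged for one product y = Ax under a row-wise (one-dimensional) split. It assumes the widely used "connectivity minus one" measure, in which each column forms a net and every part touching that net, other than the owner of the corresponding vector entry, receives the entry once. The abstract does not spell out the paper's exact models, so the function name `comm_volume`, the rule that row i and entry x[i] share an owner, and the six-row example matrix are illustrative assumptions.

```python
def comm_volume(nonzeros, part):
    """Vector entries sent for one y = A*x under a row-wise split.

    nonzeros: iterable of (i, j) positions of non-zeros in A.
    part: part[i] is the processor owning row i and vector entry x[i].
    """
    # Net j collects the parts that need x[j]: its owner plus every
    # part holding a row with a non-zero in column j.
    nets = {}
    for i, j in nonzeros:
        nets.setdefault(j, {part[j]}).add(part[i])
    # Every part on a net other than the owner receives x[j] once.
    return sum(len(parts) - 1 for parts in nets.values())

# Hypothetical 6x6 matrix: rows {0, 2, 4} and {1, 3, 5} form two
# tightly linked groups joined by the single non-zero at (4, 1).
nonzeros = [(0, 2), (2, 4), (4, 0), (2, 0),
            (1, 3), (3, 5), (5, 1), (3, 1),
            (4, 1)]

block_split = [0, 0, 0, 1, 1, 1]    # contiguous rows per processor
cluster_split = [0, 1, 0, 1, 0, 1]  # rows grouped by link structure

print(comm_volume(nonzeros, block_split))    # 4
print(comm_volume(nonzeros, cluster_split))  # 1
```

Both splits give each processor three rows, yet the structure-aware split sends a quarter of the data. A hypergraph partitioner searches for splits of this kind subject to a load-balance constraint.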



Published In

EPEW'05/WS-FM'05: Proceedings of the 2005 international conference on European Performance Engineering, and Web Services and Formal Methods, international conference on Formal Techniques for Computer Systems and Business Processes
September 2005
349 pages
ISBN: 3540287019
Editors: Mario Bravetti, Leïla Kloul, Gianluigi Zavattaro

Sponsors

  • CNRS: Centre National de la Recherche Scientifique
  • Laboratoire PRiSM
  • Université de Versailles Saint-Quentin-en-Yvelines

Publisher

Springer-Verlag

Berlin, Heidelberg

Cited By

  • (2013) Scalable matrix computations on large scale-free graphs using 2D graph partitioning. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pp. 1-12. DOI: 10.1145/2503210.2503293. Online publication date: 17-Nov-2013
  • (2011) A scalable eigensolver for large scale-free graphs using 2D graph partitioning. Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-11. DOI: 10.1145/2063384.2063469. Online publication date: 12-Nov-2011
  • (2009) PageRank. Proceedings of the 2nd International Conference on Theory of Information Retrieval: Advances in Information Retrieval Theory, pp. 17-28. DOI: 10.1007/978-3-642-04417-5_3. Online publication date: 3-Sep-2009
  • (2007) Optimizing web structures using web mining techniques. Proceedings of the 8th international conference on Intelligent data engineering and automated learning, pp. 653-662. DOI: 10.5555/1777942.1778010. Online publication date: 16-Dec-2007
  • (2006) A web-site-based partitioning technique for reducing preprocessing overhead of parallel PageRank computation. Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing, pp. 908-918. DOI: 10.5555/1775059.1775187. Online publication date: 18-Jun-2006
