A scalable eigensolver for large scale-free graphs using 2D graph partitioning

Authors:

Allison H. Baker,

Van Emden Henson

SC '11: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis

Article No.: 63, Pages 1 - 11

https://doi.org/10.1145/2063384.2063469

Published: 12 November 2011

Abstract

Eigensolvers are important tools for analyzing and mining useful information from scale-free graphs. Such graphs are used in many applications and can be extremely large. Unfortunately, existing parallel eigensolvers do not scale well for these graphs because of the high communication overhead in the parallel matrix-vector multiplication (MatVec). We develop a MatVec algorithm based on 2D edge partitioning that significantly reduces the communication costs, and we embed it into a popular eigensolver library. We demonstrate that the enhanced eigensolver can attain a performance improvement of two orders of magnitude over the original on a state-of-the-art massively parallel machine. We illustrate the performance of the embedded MatVec by computing eigenvalues of a scale-free graph with 300 million vertices and 5 billion edges, which is, to the best of our knowledge, the largest scale-free graph analyzed by any in-memory parallel eigensolver.
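
To make the idea of a 2D edge-partitioned MatVec more concrete, the following is a minimal Python sketch that simulates the data layout serially. It is not the authors' implementation: the function names (`block_of`, `partition_edges_2d`, `matvec_2d`, `matvec_reference`), the contiguous block assignment of vertices, and the serial loop standing in for a `pr` x `pc` process grid are all assumptions made for illustration. Each simulated process (i, j) owns only the edges whose row falls in row block i and whose column falls in column block j, multiplies them locally, and contributes partial sums that are then combined per row block.

```python
from collections import defaultdict


def block_of(index, n, parts):
    """Map a vertex index in [0, n) to one of `parts` contiguous blocks."""
    size = -(-n // parts)  # ceiling division: vertices per block
    return index // size


def partition_edges_2d(edges, n, pr, pc):
    """Assign each nonzero (row, col, value) to grid cell (row block, col block)."""
    local = defaultdict(list)
    for r, c, v in edges:
        local[(block_of(r, n, pr), block_of(c, n, pc))].append((r, c, v))
    return local


def matvec_2d(edges, x, pr, pc):
    """Simulate y = A x on a hypothetical pr x pc process grid (serially)."""
    n = len(x)
    local = partition_edges_2d(edges, n, pr, pc)
    y = [0.0] * n
    for i in range(pr):
        for j in range(pc):
            # Expand step (simulated): cell (i, j) reads only x entries
            # belonging to column block j.
            partial = defaultdict(float)
            for r, c, v in local.get((i, j), []):
                partial[r] += v * x[c]  # local multiply
            # Fold step (simulated): partial sums for row block i are
            # summed over the pc cells in grid row i.
            for r, s in partial.items():
                y[r] += s
    return y


def matvec_reference(edges, x):
    """Plain unpartitioned sparse MatVec, used only as a correctness check."""
    y = [0.0] * len(x)
    for r, c, v in edges:
        y[r] += v * x[c]
    return y


# Star graph: hub vertex 0 linked to vertices 1..5 (symmetric adjacency).
n = 6
edges = [(0, k, 1.0) for k in range(1, n)] + [(k, 0, 1.0) for k in range(1, n)]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(matvec_2d(edges, x, pr=2, pc=3))  # [20.0, 1.0, 1.0, 1.0, 1.0, 1.0]
```

The star graph is a small stand-in for the high-degree hubs found in scale-free graphs: in this sketch the hub's row of edges is spread over the cells of one grid row rather than held by a single owner. In a 2D layout of this kind a process generally exchanges data only with the other processes in its grid row and grid column, which is the usual argument for lower communication cost compared with a 1D row distribution.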

Cited By

Feldmann A, Golden C, Yang Y, Emer J, Sanchez D (2024). Azul: An Accelerator for Sparse Iterative Solvers Leveraging Distributed On-Chip Memory. 2024 57th IEEE/ACM International Symposium on Microarchitecture (MICRO), pages 643-656. Online publication date: 2-Nov-2024.
https://doi.org/10.1109/MICRO61859.2024.00054
Qu W, Zhang W, Cheng J, Zhang C, Han W, Bai B, Zhang C, He L, Wang X (2023). Optimizing Graph Partition by Optimal Vertex-Cut: A Holistic Approach. 2023 IEEE 39th International Conference on Data Engineering (ICDE), pages 1019-1031. Online publication date: Apr-2023.
https://doi.org/10.1109/ICDE55515.2023.00083
Li T, Shen L (2023). A sparse matrix vector multiplication accelerator based on high-bandwidth memory. Computers and Electrical Engineering, 105:C. Online publication date: 1-Jan-2023.
https://dl.acm.org/doi/10.1016/j.compeleceng.2022.108488

Information & Contributors

Information

Published In

SC '11: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis

November 2011

866 pages

ISBN:9781450307710

DOI:10.1145/2063384

Conference Chair: Scott Lathrop, University of Chicago
Program Chairs: Jim Costa, Sandia National Laboratories; William Kramer, National Center for Supercomputing Applications

Copyright © 2011 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGARCH: ACM Special Interest Group on Computer Architecture
IEEE-CS: Computer Society

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SC '11

Sponsor:

SIGARCH
IEEE-CS

SC '11: International Conference for High Performance Computing, Networking, Storage and Analysis

November 12 - 18, 2011

Seattle, Washington

Acceptance Rates

SC '11 Paper Acceptance Rate 74 of 352 submissions, 21%;

Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 29
Total Downloads: 393
Downloads (Last 12 months): 11
Downloads (Last 6 weeks): 3

Reflects downloads up to 12 Jan 2025

Citations

Cited By

Li T, Shen L, Yao S (2022). A High-performance SpMV Accelerator on HBM-equipped FPGAs. 2022 IEEE 24th Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys), pages 1081-1087. Online publication date: Dec-2022.
https://doi.org/10.1109/HPCC-DSS-SmartCity-DependSys57074.2022.00171
Shi J, Li R, Xi Y, Saad Y, de Hoop M (2021). Planetary Normal Mode Computation: Parallel Algorithms, Performance, and Reproducibility. IEEE Transactions on Parallel and Distributed Systems, 32:11, pages 2609-2622. Online publication date: 1-Nov-2021.
https://doi.org/10.1109/TPDS.2021.3050448
Xie X, Liang Z, Gu P, Basak A, Deng L, Liang L, Hu X, Xie Y (2021). SpaceA: Sparse Matrix Vector Multiplication on Processing-in-Memory Accelerator. 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pages 570-583. Online publication date: Feb-2021.
https://doi.org/10.1109/HPCA51647.2021.00055
Kurte K, Imam N, Hasan S, Kannan R (2021). Phoenix: A Scalable Streaming Hypergraph Analysis Framework. Advances in Data Science and Information Engineering, pages 3-25. Online publication date: 30-Oct-2021.
https://doi.org/10.1007/978-3-030-71704-9_1
Hanai M, Suzumura T, Tan W, Liu E, Theodoropoulos G, Cai W (2019). Distributed edge partitioning for trillion-edge graphs. Proceedings of the VLDB Endowment, 12:13, pages 2379-2392. Online publication date: 1-Sep-2019.
https://dl.acm.org/doi/10.14778/3358701.3358706
Sallinen S, Pearce R, Ripeanu M (2019). Incremental Graph Processing for On-line Analytics. 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 1007-1018. Online publication date: May-2019.
https://doi.org/10.1109/IPDPS.2019.00108
Shi J, Li R, Xi Y, Saad Y, de Hoop M (2018). Computing planetary interior normal modes with a highly parallel polynomial filtering eigensolver. Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, pages 1-13. Online publication date: 11-Nov-2018.
https://dl.acm.org/doi/10.5555/3291656.3291751