DOI: 10.1145/2063384.2063469

A scalable eigensolver for large scale-free graphs using 2D graph partitioning

Published: 12 November 2011
  • Abstract

    Eigensolvers are important tools for analyzing and mining useful information from scale-free graphs. Such graphs are used in many applications and can be extremely large. Unfortunately, existing parallel eigensolvers do not scale well for these graphs because of the high communication overhead in the parallel matrix-vector multiplication (MatVec). We develop a MatVec algorithm based on 2D edge partitioning that significantly reduces the communication costs, and we embed it into a popular eigensolver library. We demonstrate that the enhanced eigensolver can attain a performance improvement of two orders of magnitude over the original on a state-of-the-art massively parallel machine. We illustrate the performance of the embedded MatVec by computing eigenvalues of a scale-free graph with 300 million vertices and 5 billion edges, which is, to the best of our knowledge, the largest scale-free graph analyzed by any in-memory parallel eigensolver.
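
    The 2D edge partitioning idea mentioned in the abstract can be illustrated with a small, serial sketch. This is not the authors' implementation, and it is not the API of any eigensolver library. It assumes a generic block layout in which the nonzero (i, j) of the adjacency matrix is owned by the process at grid position (row block of i, column block of j), and it simulates the process grid with nested loops instead of real message passing. All function names and the grid sizes `pr` and `pc` are hypothetical.

```python
from collections import defaultdict


def block_of(index, n, parts):
    """Map a vertex index to one of `parts` contiguous blocks (assumed block layout)."""
    size = -(-n // parts)  # ceiling division
    return index // size


def partition_edges_2d(edges, n, pr, pc):
    """Assign each nonzero (i, j, w) to grid position (row block of i, column block of j)."""
    local = defaultdict(list)
    for i, j, w in edges:
        local[(block_of(i, n, pr), block_of(j, n, pc))].append((i, j, w))
    return local


def matvec_2d(local, x, n, pr, pc):
    """Compute y = A x by visiting the simulated pr x pc process grid one process at a time."""
    y = [0.0] * n
    for r in range(pr):
        for c in range(pc):
            # Simulated process (r, c) reads only the x entries in column block c.
            partial = defaultdict(float)
            for i, j, w in local.get((r, c), []):
                partial[i] += w * x[j]
            # Its partial sums touch only rows in row block r, so in a real
            # distributed run the reduction would stay within one process row.
            for i, value in partial.items():
                y[i] += value
    return y


def matvec_reference(edges, x, n):
    """Plain edge-list MatVec, used only to check the partitioned version."""
    y = [0.0] * n
    for i, j, w in edges:
        y[i] += w * x[j]
    return y
```

    In a layout like this, each simulated process exchanges vector entries only with the other processes in its grid row and grid column rather than with every process, which is the usual motivation for 2D layouts when a few vertices have very high degree. How the paper assigns edges and balances load may differ from this sketch.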

    Published In

    SC '11: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
    November 2011
    866 pages
    ISBN:9781450307710
    DOI:10.1145/2063384
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Research-article

    Conference

    SC '11

    Acceptance Rates

    SC '11 Paper Acceptance Rate 74 of 352 submissions, 21%;
    Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

    Article Metrics

    • Downloads (last 12 months): 14
    • Downloads (last 6 weeks): 0
    Reflects downloads up to 29 Jul 2024

    Cited By

    • (2023) Optimizing Graph Partition by Optimal Vertex-Cut: A Holistic Approach. 2023 IEEE 39th International Conference on Data Engineering (ICDE), pages 1019-1031. DOI: 10.1109/ICDE55515.2023.00083. Online publication date: Apr-2023.
    • (2023) A sparse matrix vector multiplication accelerator based on high-bandwidth memory. Computers and Electrical Engineering, 105:C. DOI: 10.1016/j.compeleceng.2022.108488. Online publication date: 1-Jan-2023.
    • (2022) A High-performance SpMV Accelerator on HBM-equipped FPGAs. 2022 IEEE 24th Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys), pages 1081-1087. DOI: 10.1109/HPCC-DSS-SmartCity-DependSys57074.2022.00171. Online publication date: Dec-2022.
    • (2021) Planetary Normal Mode Computation: Parallel Algorithms, Performance, and Reproducibility. IEEE Transactions on Parallel and Distributed Systems, 32:11, pages 2609-2622. DOI: 10.1109/TPDS.2021.3050448. Online publication date: 1-Nov-2021.
    • (2021) SpaceA: Sparse Matrix Vector Multiplication on Processing-in-Memory Accelerator. 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pages 570-583. DOI: 10.1109/HPCA51647.2021.00055. Online publication date: Feb-2021.
    • (2021) Phoenix: A Scalable Streaming Hypergraph Analysis Framework. Advances in Data Science and Information Engineering, pages 3-25. DOI: 10.1007/978-3-030-71704-9_1. Online publication date: 30-Oct-2021.
    • (2019) Distributed edge partitioning for trillion-edge graphs. Proceedings of the VLDB Endowment, 12:13, pages 2379-2392. DOI: 10.14778/3358701.3358706. Online publication date: 1-Sep-2019.
    • (2019) Incremental Graph Processing for On-line Analytics. 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 1007-1018. DOI: 10.1109/IPDPS.2019.00108. Online publication date: May-2019.
    • (2018) Computing planetary interior normal modes with a highly parallel polynomial filtering eigensolver. Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, pages 1-13. DOI: 10.5555/3291656.3291751. Online publication date: 11-Nov-2018.
    • (2018) CVR: efficient vectorization of SpMV on x86 processors. Proceedings of the 2018 International Symposium on Code Generation and Optimization, pages 149-162. DOI: 10.1145/3168818. Online publication date: 24-Feb-2018.
