An MPI-based Algorithm for Mapping Complex Networks onto Hierarchical Architectures

Authors: Charilaos Tzovas, Christian Schulz, Henning Meyerhenke

Euro-Par 2021: Parallel Processing: 27th International Conference on Parallel and Distributed Computing, Lisbon, Portugal, September 1–3, 2021, Proceedings

Pages 167 - 182

https://doi.org/10.1007/978-3-030-85665-6_11

Published: 01 September 2021

Abstract

Processing massive application graphs on distributed memory systems requires mapping the graphs onto the system’s processing elements (PEs). This task becomes all the more important when PEs have non-uniform communication costs or the input is highly irregular. Typically, mapping is addressed using partitioning, either in a two-step approach or in an integrated one. Parallel partitioning tools do exist; yet, the corresponding mapping algorithms or their public implementations all have major sequential parts or other severe scaling limitations.

In this paper, we propose a parallel algorithm that maps graphs onto the PEs of a hierarchical system. Our solution integrates partitioning and mapping; it models the system hierarchy in a concise way as an implicit labeled tree. The vertices of the application graph are labeled as well, and these vertex labels induce the mapping. The mapping optimization follows the basic idea of parallel label propagation, but we tailor the gain computations of label changes to quickly account for the induced communication costs. Our MPI-based code is the first public implementation of a parallel graph mapping algorithm; to this end, we extend the partitioning library ParHIP. To evaluate our algorithm’s implementation, we perform comparative experiments with complex networks in the million- and billion-scale range. In general, our mapping tool shows good scalability on up to a few thousand PEs. Compared to other MPI-based competitors, our algorithm achieves the best speed-to-quality trade-off, and our quality results are even better than those of non-parallel mapping tools.
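To make the idea of labels on an implicit hierarchy tree more concrete, the following is a minimal sequential sketch in Python. It is not the authors' MPI implementation and does not use ParHIP's API. All names (`pe_distance`, `comm_cost`, `refine`, `fanouts`, `costs`, `capacity`) are hypothetical. It assumes that PE ids are numbered so that sibling PEs are consecutive, that the distance between two PEs is the cost of the highest tree level at which they diverge, and that balance is enforced by a simple per-PE vertex capacity.

```python
from typing import Dict, List, Tuple

# Undirected graph: vertex -> [(neighbor, edge weight)], each edge stored twice.
Graph = Dict[int, List[Tuple[int, int]]]

def pe_distance(p: int, q: int, fanouts: List[int], costs: List[int]) -> int:
    """Distance of PEs p and q in an implicit tree (assumed cost model).

    fanouts[i] is the number of children per tree node on level i (root first);
    costs[i] is charged if p and q first fall into different subtrees on level i.
    """
    block = 1
    for f in fanouts:
        block *= f                  # total number of PEs
    for f, c in zip(fanouts, costs):
        block //= f                 # PEs per subtree on this level
        if p // block != q // block:
            return c
    return 0                        # same PE

def comm_cost(graph: Graph, label: Dict[int, int],
              fanouts: List[int], costs: List[int]) -> int:
    """Sum of edge weight times PE distance, counting each edge once."""
    return sum(w * pe_distance(label[u], label[v], fanouts, costs)
               for u in graph for v, w in graph[u] if u < v)

def refine(graph: Graph, label: Dict[int, int], fanouts: List[int],
           costs: List[int], capacity: int, rounds: int = 5) -> Dict[int, int]:
    """Label-propagation-style refinement: move a vertex to a neighbor's PE
    when that strictly lowers the cost of its incident edges (positive gain)
    and the target PE still has room. Assumes no self-loops."""
    load: Dict[int, int] = {}
    for pe in label.values():
        load[pe] = load.get(pe, 0) + 1
    for _ in range(rounds):
        moved = False
        for u in sorted(graph):
            def local_cost(pe: int) -> int:
                return sum(w * pe_distance(pe, label[v], fanouts, costs)
                           for v, w in graph[u])
            current = label[u]
            best, best_cost = current, local_cost(current)
            for cand in sorted({label[v] for v, _ in graph[u]}):
                if cand != current and load.get(cand, 0) < capacity:
                    c = local_cost(cand)
                    if c < best_cost:
                        best, best_cost = cand, c
            if best != current:
                load[current] -= 1
                load[best] = load.get(best, 0) + 1
                label[u] = best
                moved = True
        if not moved:
            break
    return label

# Toy input: two triangles joined by edge 2-3, mapped onto 2 nodes x 2 cores.
demo_graph: Graph = {
    0: [(1, 1), (2, 1)], 1: [(0, 1), (2, 1)], 2: [(0, 1), (1, 1), (3, 1)],
    3: [(2, 1), (4, 1), (5, 1)], 4: [(3, 1), (5, 1)], 5: [(3, 1), (4, 1)],
}
demo_label = {0: 0, 1: 3, 2: 0, 3: 3, 4: 1, 5: 3}
print("cost before:", comm_cost(demo_graph, demo_label, [2, 2], [10, 1]))
refine(demo_graph, demo_label, [2, 2], [10, 1], capacity=3)
print("cost after: ", comm_cost(demo_graph, demo_label, [2, 2], [10, 1]))
```

Because a move only changes the cost of the moved vertex's incident edges, every accepted move strictly reduces the total cost in this sequential setting. A distributed version would additionally have to exchange label updates between processes, which this sketch leaves out.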

Published In

Euro-Par 2021: Parallel Processing: 27th International Conference on Parallel and Distributed Computing, Lisbon, Portugal, September 1–3, 2021, Proceedings

Sep 2021

651 pages

ISBN: 978-3-030-85664-9

DOI: 10.1007/978-3-030-85665-6

Editors: Leonel Sousa, Nuno Roma, Pedro Tomás (all Universidade de Lisboa, Lisbon, Portugal)

© Springer Nature Switzerland AG 2021.

Publisher: Springer-Verlag, Berlin, Heidelberg
