DOI: 10.1007/978-3-030-85665-6_11
Article

An MPI-based Algorithm for Mapping Complex Networks onto Hierarchical Architectures

Published: 01 September 2021

Abstract

Processing massive application graphs on distributed memory systems requires mapping the graphs onto the system’s processing elements (PEs). This task becomes all the more important when PEs have non-uniform communication costs or the input is highly irregular. Typically, mapping is addressed using partitioning, in either a two-step or an integrated approach. Parallel partitioning tools do exist; yet the corresponding mapping algorithms or their public implementations all have major sequential parts or other severe scaling limitations.
In this paper, we propose a parallel algorithm that maps graphs onto the PEs of a hierarchical system. Our solution integrates partitioning and mapping; it models the system hierarchy concisely as an implicit labeled tree. The vertices of the application graph are labeled as well, and these vertex labels induce the mapping. The mapping optimization follows the basic idea of parallel label propagation, but we tailor the gain computations of label changes to quickly account for the induced communication costs. Our MPI-based code is the first public implementation of a parallel graph mapping algorithm; to this end, we extend the partitioning library ParHIP. To evaluate our algorithm’s implementation, we perform comparative experiments with complex networks in the million- and billion-scale range. In general, our mapping tool shows good scalability on up to a few thousand PEs. Compared to other MPI-based competitors, our algorithm achieves the best speed-to-quality trade-off, and our quality results are even better than those of non-parallel mapping tools.
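The abstract does not spell out the label scheme or the gain formula, so the following is only an illustrative sketch of the general idea, not the paper's implementation. It assumes that PE labels are integers whose mixed-radix digits encode a position in the hierarchy, that the cost between two PEs depends only on the lowest tree level at which they share an ancestor, and that each PE holds at most `capacity` vertices. The names `fanout`, `level_cost`, `capacity`, and all function names are hypothetical, and the loop is sequential rather than MPI-parallel.

```python
from typing import Dict, List, Tuple

# vertex -> list of (neighbor, edge weight); assumed undirected, no self-loops
Adjacency = Dict[int, List[Tuple[int, int]]]


def pe_distance(a: int, b: int, fanout: List[int], level_cost: List[int]) -> int:
    """Cost between PEs a and b in an implicit tree (assumed cost model).

    fanout[i] is the number of children per tree node at level i, leaf level
    first; level_cost[i] applies when a and b first share an ancestor there.
    """
    if a == b:
        return 0
    for children, cost in zip(fanout, level_cost):
        a //= children  # strip one digit of the label: move up one tree level
        b //= children
        if a == b:
            return cost
    raise ValueError("PE ids exceed the hierarchy size")


def build_adjacency(edges: List[Tuple[int, int, int]]) -> Adjacency:
    """Turn (u, v, weight) triples into a symmetric adjacency structure."""
    adj: Adjacency = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    return adj


def comm_cost(edges, label, fanout, level_cost) -> int:
    """Total communication cost induced by the vertex labels."""
    return sum(w * pe_distance(label[u], label[v], fanout, level_cost)
               for u, v, w in edges)


def move_gain(v, new_pe, adj, label, fanout, level_cost) -> int:
    """Cost reduction if vertex v changes its label to new_pe (positive is better)."""
    old_pe = label[v]
    return sum(w * (pe_distance(old_pe, label[u], fanout, level_cost)
                    - pe_distance(new_pe, label[u], fanout, level_cost))
               for u, w in adj[v])


def propagate(adj, label, fanout, level_cost, capacity, rounds=3):
    """Sequential label propagation over neighbor labels, respecting capacity."""
    load: Dict[int, int] = {}
    for pe in label.values():
        load[pe] = load.get(pe, 0) + 1
    for _ in range(rounds):
        moved = False
        for v in sorted(adj):
            best_pe, best_gain = label[v], 0
            # Only labels already present in the neighborhood are candidates.
            for pe in sorted({label[u] for u, _ in adj[v]} - {label[v]}):
                if load.get(pe, 0) >= capacity:
                    continue  # target PE is full
                gain = move_gain(v, pe, adj, label, fanout, level_cost)
                if gain > best_gain:
                    best_pe, best_gain = pe, gain
            if best_pe != label[v]:
                load[label[v]] -= 1
                load[best_pe] = load.get(best_pe, 0) + 1
                label[v] = best_pe
                moved = True
        if not moved:
            break  # converged early
    return label
```

For example, with `fanout=[2, 2]`, `level_cost=[1, 10]`, edges `[(0, 1, 5), (2, 3, 5), (1, 2, 1)]`, initial labels `{0: 0, 1: 2, 2: 1, 3: 3}`, and `capacity=2`, the sketch lowers the total cost from 110 to 1 by co-locating the two heavy edges. The gain only inspects the neighbors of the moved vertex, which is what keeps each label change cheap to evaluate.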

Published In

Euro-Par 2021: Parallel Processing: 27th International Conference on Parallel and Distributed Computing, Lisbon, Portugal, September 1–3, 2021, Proceedings
Sep 2021
651 pages
ISBN:978-3-030-85664-9
DOI:10.1007/978-3-030-85665-6

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. Load balancing
  2. Process mapping
  3. Hierarchical architectures
  4. Parallel label propagation

Qualifiers

  • Article
