DOI: 10.1145/1995896.1995909
research-article

Generic topology mapping strategies for large-scale parallel architectures

Published: 31 May 2011

Abstract

The steadily increasing number of nodes in high-performance computing systems and the technology and power constraints lead to sparse network topologies. Efficient mapping of application communication patterns to the network topology gains importance as systems grow to petascale and beyond. Such mapping is supported in parallel programming frameworks such as MPI, but is often not well implemented. We show that the topology mapping problem is NP-complete and analyze and compare different practical topology mapping heuristics. We demonstrate an efficient and fast new heuristic which is based on graph similarity and show its utility with application communication patterns on real topologies. Our mapping strategies support heterogeneous networks and show significant reduction of congestion on torus, fat-tree, and the PERCS network topologies, for irregular communication patterns. We also demonstrate that the benefit of topology mapping grows with the network size and show how our algorithms can be used in a practical setting to optimize communication performance. Our efficient topology mapping strategies are shown to reduce network congestion by up to 80%, reduce average dilation by up to 50%, and improve benchmarked communication performance by 18%.
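
The two metrics quoted in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a ring (1-D torus) topology, a single BFS shortest path per message, directed unit-capacity links, and the hypothetical helper names `ring_topology`, `shortest_path`, and `evaluate`. Dilation is taken as the hop count of a message's route, and congestion as the total traffic volume crossing a link.

```python
from collections import deque


def ring_topology(n):
    """Adjacency list of an n-node ring (assumed example topology)."""
    return {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}


def shortest_path(topo, src, dst):
    """Return one BFS shortest path from src to dst as a list of nodes."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for w in topo[u]:
            if w not in parent:
                parent[w] = u
                queue.append(w)
    # Walk the parent pointers back from dst to src.
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def evaluate(topo, comm, mapping):
    """Return (average dilation, maximum congestion) for a mapping.

    comm    -- dict {(src_process, dst_process): traffic volume}
    mapping -- list where mapping[p] is the node hosting process p
    """
    load = {}        # traffic per directed link (u, v)
    total_hops = 0
    for (p, q), volume in comm.items():
        path = shortest_path(topo, mapping[p], mapping[q])
        total_hops += len(path) - 1
        for u, v in zip(path, path[1:]):
            load[(u, v)] = load.get((u, v), 0) + volume
    avg_dilation = total_hops / len(comm)
    max_congestion = max(load.values(), default=0)
    return avg_dilation, max_congestion


if __name__ == "__main__":
    topo = ring_topology(8)
    # Each process sends one unit of traffic to its successor.
    comm = {(p, (p + 1) % 8): 1 for p in range(8)}

    matched = list(range(8))                       # neighbors stay adjacent
    scattered = [(3 * p) % 8 for p in range(8)]    # neighbors end up 3 hops apart

    print(evaluate(topo, comm, matched))    # (1.0, 1)
    print(evaluate(topo, comm, scattered))  # (3.0, 3)
```

In this toy case the matched placement cuts both average dilation and maximum congestion to a third of the scattered placement. A mapping heuristic searches for such a placement when the communication pattern is irregular and no obvious ordering exists.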


Published In

ICS '11: Proceedings of the international conference on Supercomputing
May 2011
398 pages
ISBN:9781450301022
DOI:10.1145/1995896
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. mpi graph topologies
  2. topology mapping

Qualifiers

  • Research-article

Conference

ICS '11: International Conference on Supercomputing
May 31 - June 4, 2011
Tucson, Arizona, USA

Acceptance Rates

Overall Acceptance Rate 629 of 2,180 submissions, 29%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 62
  • Downloads (Last 6 weeks): 5
Reflects downloads up to 25 Jan 2025

Cited By

  • (2024) BoostN: Optimizing Imbalanced Neighborhood Communication on Homogeneous Many-Core System. Proceedings of the 53rd International Conference on Parallel Processing, pages 262-272. DOI: 10.1145/3673038.3673131. Online publication date: 12-Aug-2024
  • (2024) ACES: Accelerating Sparse Matrix Multiplication with Adaptive Execution Flow and Concurrency-Aware Cache Optimizations. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, pages 71-85. DOI: 10.1145/3620666.3651381. Online publication date: 27-Apr-2024
  • (2024) Dynamic Formation of Robot Movement Route in Nondeterministic Environment with Bypassing Stationary and Nonstationary Obstacles. Pattern Recognition and Image Analysis 34:3, pages 543-548. DOI: 10.1134/S1054661824700330. Online publication date: 17-Oct-2024
  • (2024) Network-Centered Resource Management for HPC Networks. 2024 IEEE 10th International Conference on Network Softwarization (NetSoft), pages 235-238. DOI: 10.1109/NetSoft60951.2024.10588913. Online publication date: 24-Jun-2024
  • (2024) Improved Parallel Application Performance and Makespan by Colocation and Topology-aware Process Mapping. 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing (CCGrid), pages 119-124. DOI: 10.1109/CCGrid59990.2024.00023. Online publication date: 6-May-2024
  • (2024) MPI task mapping for multi-cluster HPC systems. E3S Web of Conferences 548, article 03006. DOI: 10.1051/e3sconf/202454803006. Online publication date: 12-Jul-2024
  • (2024) Embedding hypercubes into torus and Cartesian product of paths and/or cycles for minimizing wirelength. Journal of Computer and System Sciences, article 103603. DOI: 10.1016/j.jcss.2024.103603. Online publication date: Nov-2024
  • (2024) On Graphs Embeddable in a Layer of a Hypercube and Their Extremal Numbers. Annals of Combinatorics 28:4, pages 1257-1283. DOI: 10.1007/s00026-024-00705-2. Online publication date: 29-Jul-2024
  • (2023) Using Mixed-Radix Decomposition to Enumerate Computational Resources of Deeply Hierarchical Architectures. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, pages 405-415. DOI: 10.1145/3624062.3624109. Online publication date: 12-Nov-2023
  • (2023) Spada: Accelerating Sparse Matrix Multiplication with Adaptive Dataflow. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, pages 747-761. DOI: 10.1145/3575693.3575706. Online publication date: 27-Jan-2023
