DOI: 10.5555/3433701.3433746

A hierarchical and load-aware design for large message neighborhood collectives

Published: 09 November 2020

Abstract

The MPI-3.0 standard introduced neighborhood collectives to support the sparse communication patterns used in many applications. In this paper, we propose a hierarchical and distributed graph topology that considers both the physical topology of the system and the virtual communication pattern of processes to improve the performance of large-message neighborhood collectives. Moreover, we propose two design alternatives on top of the hierarchical design: (1) LAG-H, which assumes the same communication load for all processes, and (2) LAW-H, which considers the communication load of each process to distribute the load fairly among them. We propose a mathematical model to determine the communication capacity of each process and then use the derived capacity to distribute the load fairly among processes. Our experimental results on up to 28,672 processes show up to 9x speedup for various process topologies. We also observe up to an 8.2% performance gain for NAS-DT and up to 34x speedup for SpMM.
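
For context, the listing below is a minimal sketch of the MPI-3.0 neighborhood-collective interface that this work builds on; it is not the LAG-H or LAW-H design itself. Each process declares its sparse communication pattern as a distributed graph topology with MPI_Dist_graph_create_adjacent and then exchanges a large message with each of its neighbors via MPI_Neighbor_alltoall. The 1-D ring topology and the per-neighbor message size are arbitrary illustrative choices, not taken from the paper.

/* Minimal MPI-3.0 neighborhood-collective sketch (illustrative only):
 * a 1-D ring virtual topology with large per-neighbor messages. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each rank receives from and sends to its left and right neighbors. */
    int left  = (rank - 1 + size) % size;
    int right = (rank + 1) % size;
    int sources[2]      = { left, right };
    int destinations[2] = { left, right };

    /* Build the distributed graph topology describing the sparse pattern. */
    MPI_Comm graph_comm;
    MPI_Dist_graph_create_adjacent(MPI_COMM_WORLD,
                                   2, sources, MPI_UNWEIGHTED,
                                   2, destinations, MPI_UNWEIGHTED,
                                   MPI_INFO_NULL, 0 /* no reorder */,
                                   &graph_comm);

    /* Large-message neighborhood all-to-all: one block per neighbor. */
    const int count = 1 << 20;                      /* doubles per neighbor */
    double *sendbuf = malloc(2 * (size_t)count * sizeof(double));
    double *recvbuf = malloc(2 * (size_t)count * sizeof(double));
    for (int i = 0; i < 2 * count; i++) sendbuf[i] = (double)rank;

    MPI_Neighbor_alltoall(sendbuf, count, MPI_DOUBLE,
                          recvbuf, count, MPI_DOUBLE, graph_comm);

    if (rank == 0)
        printf("rank 0: first element received from left neighbor = %g\n",
               recvbuf[0]);

    free(sendbuf);
    free(recvbuf);
    MPI_Comm_free(&graph_comm);
    MPI_Finalize();
    return 0;
}

Presumably, the hierarchical and load-aware designs proposed in the paper change how such a call is carried out inside the MPI library, while the user-facing neighborhood-collective API shown here stays the same.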

Published In

SC '20: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
November 2020
1454 pages
ISBN: 9781728199986

In-Cooperation

  • IEEE CS

Publisher

IEEE Press

Publication History

Published: 09 November 2020

Author Tags

  1. MPI neighborhood collective
  2. communication pattern
  3. virtual topology

Qualifiers

  • Research-article

Conference

SC '20
Acceptance Rates

Overall Acceptance Rate 1,516 of 6,373 submissions, 24%
