DOI: 10.1145/3018743.3019025

POSTER: Cache-Oblivious MPI All-to-All Communications on Many-Core Architectures

Published: 26 January 2017

Abstract

In the many-core era, the performance of MPI collectives depends increasingly on their intra-node communication component. However, intra-node communication algorithms are generally inherited from their inter-node versions and ignore cache complexity. We propose cache-oblivious algorithms for MPI all-to-all operations, in which data blocks are copied into the receive buffers in Morton order to exploit data locality. Experimental results on different many-core architectures show that our cache-oblivious implementations significantly outperform both naive shared-heap implementations and highly optimized MPI libraries.
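
To illustrate the idea of copying blocks in Morton order, the following is a minimal sequential sketch, not the implementation from the poster. It assumes that an all-to-all among `p` processes can be viewed as a `p` x `p` grid of blocks indexed by (source, destination), and that this grid is visited along a Z-order curve obtained by de-interleaving the bits of a counter. The function names `morton_decode` and `alltoall_morton`, the list-based buffers, and the choice of which index takes the even bits are all assumptions made for this example; a real shared-memory implementation would divide the copies among the processes.

```python
def morton_decode(z):
    """Split the interleaved bits of z into a (row, col) pair.

    Even bits of z form col, odd bits form row (an assumed convention).
    """
    row = col = 0
    bit = 0
    while z:
        col |= (z & 1) << bit
        row |= ((z >> 1) & 1) << bit
        z >>= 2
        bit += 1
    return row, col


def alltoall_morton(send, block):
    """Sequential model of an all-to-all copy done in Morton order.

    send[src] holds p consecutive blocks of `block` items, one per
    destination. Returns (recv, order), where recv[dst] holds one block
    per source and order lists the (src, dst) pairs in visiting order.
    """
    p = len(send)
    recv = [[None] * (p * block) for _ in range(p)]

    # Round the grid side up to a power of two so the Z-order curve
    # covers every (src, dst) pair; out-of-range pairs are skipped.
    side = 1
    while side < p:
        side *= 2

    order = []
    for z in range(side * side):
        src, dst = morton_decode(z)
        if src < p and dst < p:
            # Block `dst` of the sender lands in slot `src` of the receiver.
            recv[dst][src * block:(src + 1) * block] = \
                send[src][dst * block:(dst + 1) * block]
            order.append((src, dst))
    return recv, order
```

Because consecutive steps of the curve stay inside small square sub-grids, a few send and receive buffers are reused across nearby copies regardless of cache size, which is the locality property the abstract refers to.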


Published In

PPoPP '17: Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
January 2017
476 pages
ISBN: 9781450344937
DOI: 10.1145/3018743
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cache-oblivious algorithms
  2. many-core
  3. mpi_alltoall

Qualifiers

  • Poster

Conference

PPoPP '17
Acceptance Rates

PPoPP '17 Paper Acceptance Rate 29 of 132 submissions, 22%;
Overall Acceptance Rate 230 of 1,014 submissions, 23%
