DOI: 10.1145/1851476.1851507
Research article

Scalability of communicators and groups in MPI

Published: 21 June 2010

Abstract

As the number of cores inside compute clusters continues to grow, the scalability of MPI (Message Passing Interface) is important to ensure that programs can continue to execute on an ever-increasing number of cores. One important scalability issue for MPI is the implementation of communicators and groups. Communicators and groups are an integral part of MPI and play an essential role in the design and use of libraries. It is challenging to create an MPI implementation whose communicators and groups scale to the hundreds of thousands of processes that are possible in today's clusters. In this paper we present the design and evaluation of techniques to support the scalability of communicators and groups in MPI.
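
For readers less familiar with these objects, the following minimal example (standard MPI calls only, not code from the paper) shows how an application derives a new communicator from a group:

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Minimal example: derive a communicator containing only the
     * even-ranked processes of MPI_COMM_WORLD, using the standard
     * group-manipulation calls (MPI_Comm_group, MPI_Group_incl,
     * MPI_Comm_create). */
    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        MPI_Group world_group, even_group;
        MPI_Comm_group(MPI_COMM_WORLD, &world_group);

        /* List the even world ranks. */
        int n_even = (size + 1) / 2;
        int *even_ranks = malloc(n_even * sizeof(int));
        for (int i = 0; i < n_even; i++)
            even_ranks[i] = 2 * i;
        MPI_Group_incl(world_group, n_even, even_ranks, &even_group);

        /* Collective over MPI_COMM_WORLD; processes outside the group
         * receive MPI_COMM_NULL. */
        MPI_Comm even_comm;
        MPI_Comm_create(MPI_COMM_WORLD, even_group, &even_comm);

        if (even_comm != MPI_COMM_NULL) {
            int even_rank;
            MPI_Comm_rank(even_comm, &even_rank);
            printf("world rank %d is rank %d in the even communicator\n",
                   rank, even_rank);
            MPI_Comm_free(&even_comm);
        }

        MPI_Group_free(&even_group);
        MPI_Group_free(&world_group);
        free(even_ranks);
        MPI_Finalize();
        return 0;
    }

Every communicator carries such a group internally, which is why the per-process storage needed to represent group membership becomes a scalability concern at very large process counts.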
We have designed and implemented a fine-grain version of MPI (FG-MPI), based on MPICH2, that allows thousands of full-fledged MPI processes inside a single operating-system process. Using FG-MPI we can create hundreds of thousands of MPI processes, which allowed us to implement and evaluate solutions to the scalability issues associated with communicators. We describe techniques that allow group information to be shared among the MPI processes inside an OS process, and the design of scalable operations to create the communicators. A set-plus-permutation framework is introduced for storing group information for communicators, and a set, rather than map, representation is proposed for MPI group objects. Performance results are given for the execution of an MPI benchmark program with upwards of 100,000 processes, with communicators created for groups of various sizes and types.
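
As a rough illustration of the set-plus-permutation idea (a sketch under our own assumptions; the names and layout below are hypothetical and not taken from the paper), a group can be stored as a sorted set of member ranks, shareable among co-located MPI processes, plus a small permutation giving the group's rank ordering, instead of a per-process map from group rank to world rank:

    #include <stddef.h>

    /* Hypothetical sketch of a set-plus-permutation group representation.
     * The sorted set of member world ranks can be shared among the
     * co-located MPI processes inside one OS process; only the
     * permutation is specific to a particular ordering of the group. */
    typedef struct {
        int  nmembers;    /* number of processes in the group */
        int *sorted_set;  /* member world ranks in ascending order (shareable) */
        int *perm;        /* perm[i] = position in sorted_set of the process
                             with group rank i; NULL means the group order is
                             simply the sorted order (identity permutation) */
    } group_rep;

    /* Group rank -> world rank. */
    static int group_to_world(const group_rep *g, int group_rank)
    {
        int idx = g->perm ? g->perm[group_rank] : group_rank;
        return g->sorted_set[idx];
    }

    /* World rank -> group rank, or -1 if the process is not a member.
     * Binary search over the sorted set; a real implementation would
     * keep the inverse permutation rather than scanning for it. */
    static int world_to_group(const group_rep *g, int world_rank)
    {
        int lo = 0, hi = g->nmembers - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            if (g->sorted_set[mid] == world_rank) {
                if (!g->perm)
                    return mid;
                for (int i = 0; i < g->nmembers; i++)
                    if (g->perm[i] == mid)
                        return i;
                return -1;
            }
            if (g->sorted_set[mid] < world_rank)
                lo = mid + 1;
            else
                hi = mid - 1;
        }
        return -1;
    }

Compared with an explicit group-rank-to-world-rank array stored separately in every process, a shared set plus a (often identity) permutation is far cheaper when many co-located processes belong to communicators with the same membership.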




Published In

HPDC '10: Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
June 2010
911 pages
ISBN:9781605589428
DOI:10.1145/1851476
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 June 2010


Author Tags

  1. message passing interface
  2. multicore
  3. parallel programming

Qualifiers

  • Research-article

Conference

HPDC '10

Acceptance Rates

Overall Acceptance Rate 166 of 966 submissions, 17%


Article Metrics

  • Downloads (Last 12 months)5
  • Downloads (Last 6 weeks)0
Reflects downloads up to 08 Feb 2025


Cited By

  • (2021) Reconfigurable switches for high performance and flexible MPI collectives. Concurrency and Computation: Practice and Experience, 34(6). DOI: 10.1002/cpe.6769. Online publication date: 12-Dec-2021.
  • (2020) FPGAs in the Network and Novel Communicator Support Accelerate MPI Collectives. 2020 IEEE High Performance Extreme Computing Conference (HPEC), pages 1-10. DOI: 10.1109/HPEC43674.2020.9286200. Online publication date: 22-Sep-2020.
  • (2019) MPI Sessions: Evaluation of an Implementation in Open MPI. 2019 IEEE International Conference on Cluster Computing (CLUSTER), pages 1-11. DOI: 10.1109/CLUSTER.2019.8891002. Online publication date: Sep-2019.
  • (2018) Reference broadcast frame synchronization for distributed high-speed camera network. 2018 IEEE Sensors Applications Symposium (SAS), pages 1-5. DOI: 10.1109/SAS.2018.8336781. Online publication date: Mar-2018.
  • (2018) Lightweight MPI Communicators with Applications to Perfectly Balanced Quicksort. 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 254-265. DOI: 10.1109/IPDPS.2018.00035. Online publication date: May-2018.
  • (2017) Memory Compression Techniques for Network Address Management in MPI. 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 1008-1017. DOI: 10.1109/IPDPS.2017.18. Online publication date: May-2017.
  • (2016) DISP. Proceedings of the First Workshop on Optimization of Communication in HPC, pages 53-62. DOI: 10.5555/3018058.3018064. Online publication date: 13-Nov-2016.
  • (2016) MPI Sessions. Proceedings of the 23rd European MPI Users' Group Meeting, pages 121-129. DOI: 10.1145/2966884.2966915. Online publication date: 25-Sep-2016.
  • (2016) DISP: Optimizations towards Scalable MPI Startup. 2016 First International Workshop on Communication Optimizations in HPC (COMHPC), pages 53-62. DOI: 10.1109/COMHPC.2016.011. Online publication date: Nov-2016.
  • (2015) A Team-Based Methodology of Memory Hierarchy-Aware Runtime Support in Coarray Fortran. Proceedings of the 2015 IEEE International Conference on Cluster Computing, pages 448-451. DOI: 10.1109/CLUSTER.2015.67. Online publication date: 8-Sep-2015.
