Abstract
As large-scale clusters become more distributed and heterogeneous, significant research interest has emerged in optimizing MPI collective operations because of the performance gains that can be realized. However, researchers wishing to develop new algorithms for MPI collective operations typically face significant design, implementation, and logistical challenges. To address these needs in the MPI research community, Open MPI has been developed: a new MPI-2 implementation centered around a lightweight component architecture that provides a set of component frameworks for realizing collective algorithms, point-to-point communication, and other aspects of MPI implementations. In this chapter, we focus on the collective algorithm component framework. The “coll” framework provides tools for researchers to easily design, implement, and experiment with new collective algorithms in the context of a production-quality MPI. Performance results with basic collective operations demonstrate that the component architecture of Open MPI does not introduce any performance penalty.
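To illustrate the general idea behind such a framework, the sketch below shows one way a pluggable collective component could be structured in C: a table of function pointers for the collective operations, filled in by a third-party component and selected by the MPI library at run time. The type and function names here (coll_module_t, my_linear_bcast, and so on) are hypothetical illustrations, not the actual Open MPI “coll” interface; only standard MPI point-to-point calls are used.

/* Hypothetical sketch of a pluggable collective-component interface.
 * The real Open MPI "coll" framework differs in names and details;
 * this only illustrates the function-pointer-table idea. */
#include <mpi.h>

typedef struct coll_module {
    /* Each collective operation is a replaceable function pointer. */
    int (*bcast)(void *buf, int count, MPI_Datatype dtype,
                 int root, MPI_Comm comm);
    int (*barrier)(MPI_Comm comm);
} coll_module_t;

/* A third-party component supplies its own algorithm, here a simple
 * linear broadcast built from point-to-point messages. */
static int my_linear_bcast(void *buf, int count, MPI_Datatype dtype,
                           int root, MPI_Comm comm)
{
    int rank, size, i;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    if (rank == root) {
        for (i = 0; i < size; ++i) {
            if (i != root) {
                MPI_Send(buf, count, dtype, i, 0, comm);
            }
        }
    } else {
        MPI_Recv(buf, count, dtype, root, 0, comm, MPI_STATUS_IGNORE);
    }
    return MPI_SUCCESS;
}

/* Fan-in to rank 0, then fan-out: a simple but correct barrier. */
static int my_linear_barrier(MPI_Comm comm)
{
    int rank, size, i, token = 0;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    if (rank == 0) {
        for (i = 1; i < size; ++i)
            MPI_Recv(&token, 1, MPI_INT, i, 0, comm, MPI_STATUS_IGNORE);
        for (i = 1; i < size; ++i)
            MPI_Send(&token, 1, MPI_INT, i, 0, comm);
    } else {
        MPI_Send(&token, 1, MPI_INT, 0, 0, comm);
        MPI_Recv(&token, 1, MPI_INT, 0, 0, comm, MPI_STATUS_IGNORE);
    }
    return MPI_SUCCESS;
}

/* The component registers its implementations in a module; the MPI
 * library would select one such module per communicator at run time. */
coll_module_t my_coll_module = {
    .bcast   = my_linear_bcast,
    .barrier = my_linear_barrier,
};

In a component architecture of the kind described in the chapter, several such modules can coexist and the most suitable one is chosen per communicator; the sketch above conveys only the interface shape, not the selection logic.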
Cite this paper
Squyres, J.M., Lumsdaine, A. (2005). The Component Architecture of Open MPI: Enabling Third-Party Collective Algorithms. In: Getov, V., Kielmann, T. (eds) Component Models and Systems for Grid Applications. Springer, Boston, MA. https://doi.org/10.1007/0-387-23352-0_11
Print ISBN: 978-0-387-23351-2
Online ISBN: 978-0-387-23352-9