From the Publisher:
MPI, the Message Passing Interface, is a standard and portable library of communications subroutines for parallel programming designed to function on a wide variety of parallel computers. It is useful both on parallel computers, such as IBM's SP2, the Cray Research T3D, and the Connection Machine, and on networks of workstations. Written by five of the principal creators of the latest MPI standard, MPI: The Complete Reference is an annotated manual for the latest 1.1 version of the standard that illuminates the more advanced and subtle features of MPI. It can be read in conjunction with the companion tutorial volume, Using MPI: Portable Parallel Programming with the Message-Passing Interface, by William Gropp, Ewing Lusk, and Anthony Skjellum.
MPI: The Complete Reference is the only source that covers such advanced issues in parallel computing and programming as true portability, deadlock, high-performance message passing, and libraries for distributed and parallel computing. The annotations provide numerous illustrative programming examples and delve into even the most esoteric features or consequences of the standard. They explain why certain design choices were made, how users should use the interface, and how implementors should construct their own version of MPI.
Scientific and Engineering Computation series
- Mizutani Y, Ino F and Hagihara K Fast performance prediction of master-slave programs by partial task execution Proceedings of the 4th WSEAS International Conference on Software Engineering, Parallel & Distributed Systems, (1-7)
- Liao W, Choudhary A, Weiner D and Varshney P (2005). Performance Evaluation of a Parallel Pipeline Computational Model for Space-Time Adaptive Processing, The Journal of Supercomputing, 31:2, (137-160), Online publication date: 1-Feb-2005.
- O'Cearbhaill E and O'Mahony M (2005). Parallel implementation of a transportation network model, Journal of Parallel and Distributed Computing, 65:1, (1-14), Online publication date: 1-Jan-2005.
- Mohror K and Karavanic K Performance Tool Support for MPI-2 on Linux Proceedings of the 2004 ACM/IEEE conference on Supercomputing
- Dotsenko Y, Coarfa C, Mellor-Crummey J and Chavarría-Miranda D Experiences with co-array fortran on hardware shared memory platforms Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing, (332-347)
- Yang J, Chen H, Kim B, Hariri S and Parashar M Autonomic runtime system for large scale parallel and distributed applications Proceedings of the 2004 international conference on Unconventional Programming Paradigms, (297-311)
- Lirkov I Parallel performance of a 3d elliptic solver Proceedings of the Third international conference on Numerical Analysis and its Applications, (383-390)
- González J, León C and Rodríguez C A distributed divide and conquer skeleton Proceedings of the 7th international conference on Applied Parallel Computing: state of the Art in Scientific Computing, (481-489)
- Vasconcelos P and d'Almeida F Performance evaluation of a parallel algorithm for a radiative transfer problem Proceedings of the 7th international conference on Applied Parallel Computing: state of the Art in Scientific Computing, (864-871)
- Dekel E and Goft G (2004). ITRA, The Journal of Supercomputing, 28:1, (43-70), Online publication date: 1-Apr-2004.
- Luong P, Breshears C and Ly L (2004). Application of Multiblock Grid and Dual-Level Parallelism in Coastal Ocean Circulation Modeling, Journal of Scientific Computing, 20:2, (257-277), Online publication date: 1-Apr-2004.
- Barker K, Chernikov A, Chrisochoides N and Pingali K (2004). A Load Balancing Framework for Adaptive and Asynchronous Applications, IEEE Transactions on Parallel and Distributed Systems, 15:2, (183-192), Online publication date: 1-Feb-2004.
- References Grid resource management, (507-566)
- Hill C, DeLuca C, Balaji V, Suarez M and Silva A (2004). The Architecture of the Earth System Modeling Framework, Computing in Science and Engineering, 6:1, (18-28), Online publication date: 1-Jan-2004.
- Kaiser T (2003). A methodology for creating large modules, ACM SIGPLAN Fortran Forum, 22:3, (11-24), Online publication date: 1-Dec-2003.
- Bouteiller A, Cappello F, Herault T, Krawezik G, Lemarinier P and Magniette F MPICH-V2 Proceedings of the 2003 ACM/IEEE conference on Supercomputing
- Pérez C, Priol T and Ribes A (2003). A Parallel Corba Component Model for Numerical Code Coupling, International Journal of High Performance Computing Applications, 17:4, (417-429), Online publication date: 1-Nov-2003.
- Chan F, Cao J and Sun Y (2003). High-level abstractions for message-passing parallel programming, Parallel Computing, 29:11-12, (1589-1621), Online publication date: 1-Nov-2003.
- Tan K, Szafron D, Schaeffer J, Anvik J and MacDonald S (2003). Using generative design patterns to generate parallel code for a distributed memory environment, ACM SIGPLAN Notices, 38:10, (203-215), Online publication date: 1-Oct-2003.
- Deitz S, Chamberlain B, Choi S and Snyder L (2003). The design and implementation of a parallel array operator for the arbitrary remapping of data, ACM SIGPLAN Notices, 38:10, (155-166), Online publication date: 1-Oct-2003.
- Chu L, Tang H, Yang T and Shen K (2003). Optimizing data aggregation for cluster-based internet services, ACM SIGPLAN Notices, 38:10, (119-130), Online publication date: 1-Oct-2003.
- Michailidis P and Margaritis K (2003). Performance evaluation of load balancing strategies for approximate string matching application on an MPI cluster of heterogeneous workstations, Future Generation Computer Systems, 19:7, (1075-1104), Online publication date: 1-Oct-2003.
- Amestoy P, Duff I, L'Excellent J and Li X (2003). Impact of the implementation of MPI point-to-point communications on the performance of two general sparse solvers, Parallel Computing, 29:7, (833-849), Online publication date: 1-Jul-2003.
- Tan K, Szafron D, Schaeffer J, Anvik J and MacDonald S Using generative design patterns to generate parallel code for a distributed memory environment Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming, (203-215)
- Deitz S, Chamberlain B, Choi S and Snyder L The design and implementation of a parallel array operator for the arbitrary remapping of data Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming, (155-166)
- Chu L, Tang H, Yang T and Shen K Optimizing data aggregation for cluster-based internet services Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming, (119-130)
- Krawezik G Performance comparison of MPI and three openMP programming styles on shared memory multiprocessors Proceedings of the fifteenth annual ACM symposium on Parallel algorithms and architectures, (118-127)
- Gava F and Loulergue F A parallel virtual machine for bulk synchronous parallel ML Proceedings of the 1st international conference on Computational science: Part I, (155-164)
- Saywell M and Reeve J JDOS Proceedings of the 2003 international conference on Computational science: Part III, (570-580)
- Simpson D and Reeve J Deadlock free specification based on local process properties Proceedings of the 2003 international conference on Computational science: Part III, (350-359)
- Alves A, Pina A, Exposto J and Rufino J ToCL Proceedings of the 2003 international conference on Computational science: Part II, (1022-1031)
- Rubio F and Rodríguez I A parallel framework for computational science Proceedings of the 2003 international conference on Computational science: Part II, (1002-1011)
- Bellucci D, Tasso S and Laganà A Parallel models for a discrete variable wavepacket propagation Proceedings of the 2003 international conference on Computational science: Part II, (341-349)
- Standish R, Chee C and Smeds N OpenMP in the field Proceedings of the 2003 international conference on Computational science, (637-647)
- Boeres C and Rebello V (2003). Towards Optimal Static Task Scheduling for Realistic Machine Models, International Journal of High Performance Computing Applications, 17:2, (173-189), Online publication date: 1-May-2003.
- Berman F, Wolski R, Casanova H, Cirne W, Dail H, Faerman M, Figueira S, Hayes J, Obertelli G, Schopf J, Shao G, Smallen S, Spring N, Su A and Zagorodnov D (2003). Adaptive Computing on the Grid Using AppLeS, IEEE Transactions on Parallel and Distributed Systems, 14:4, (369-382), Online publication date: 1-Apr-2003.
- Aiex R, Binato S and Resende M (2003). Parallel GRASP with path-relinking for job shop scheduling, Parallel Computing, 29:4, (393-430), Online publication date: 1-Apr-2003.
- Guibault F, Roy R, Laflamme S and Dompierre J (2003). Applying Parmetis to Structured Remeshing for Industrial CFD Applications, International Journal of High Performance Computing Applications, 17:1, (63-76), Online publication date: 1-Feb-2003.
- Dongarra J, Foster I, Fox G, Gropp W, Kennedy K, Torczon L and White A References Sourcebook of parallel computing, (729-789)
- Pllana S and Fahringer T Parallel and distributed systems Proceedings of the 34th conference on Winter simulation: exploring new frontiers, (497-505)
- MacDonald S, Anvik J, Bromling S, Schaeffer J, Szafron D and Tan K (2002). From patterns to frameworks to parallel programs, Parallel Computing, 28:12, (1663-1683), Online publication date: 1-Dec-2002.
- Vetter J and Yoo A An empirical performance evaluation of scalable scientific applications Proceedings of the 2002 ACM/IEEE conference on Supercomputing, (1-18)
- Träff J Implementing the MPI process topology mechanism Proceedings of the 2002 ACM/IEEE conference on Supercomputing, (1-14)
- Lastovetsky A (2002). Adaptive parallel computing on heterogeneous networks with mpC, Parallel Computing, 28:10, (1369-1407), Online publication date: 1-Oct-2002.
- Natrajan A, Humphrey M and Grimshaw A (2002). The Legion support for advanced parameter-space studies on a grid, Future Generation Computer Systems, 18:8, (1033-1052), Online publication date: 1-Oct-2002.
- Chavarría-Miranda D and Mellor-Crummey J An Evaluation of Data-Parallel Compiler Support for Line-Sweep Applications Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques, (7-17)
- Gray P and Sunderam V (2002). Collaborative Metacomputing with IceT, The Journal of Supercomputing, 23:2, (139-166), Online publication date: 1-Sep-2002.
- Huang C and Nof S (2002). Evaluation of agent-based manufacturing systems based on a parallel simulator, Computers and Industrial Engineering, 43:3, (529-552), Online publication date: 1-Sep-2002.
- Duato J, Yalamanchili S and Ni L (2002). Interconnection Networks, 10.5555/2821578, Online publication date: 6-Aug-2002.
- Deitz S, Chamberlain B and Snyder L (2002). High-level Language Support for User-defined Reductions, The Journal of Supercomputing, 23:1, (23-37), Online publication date: 1-Aug-2002.
- Wallcraft A (2002). A Comparison of Co-Array Fortran and OpenMP Fortran for SPMD Programming, The Journal of Supercomputing, 22:3, (231-250), Online publication date: 1-Jul-2002.
- Rufino J, Pina A, Alves A and Exposto J Distributed paged hash tables Proceedings of the 5th international conference on High performance computing for computational science, (679-693)
- Alves A, Pina A, Exposto J and Rufino J Scalable multithreading in a low latency Myrinet cluster Proceedings of the 5th international conference on High performance computing for computational science, (579-593)
- Caballer M, Guerrero D, Hernández V and Román J A parallel rendering algorithm based on hierarchical radiosity Proceedings of the 5th international conference on High performance computing for computational science, (523-536)
- Nool M and Proot M A parallel, state-of-the-art, least-squares spectral element solver for incompressible flow problems Proceedings of the 5th international conference on High performance computing for computational science, (39-52)
- Jin G and Mellor-Crummey J Experiences tuning SMG98 Proceedings of the 16th international conference on Supercomputing, (305-314)
- Vetter J Dynamic statistical profiling of communication activity in distributed applications Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, (240-250)
- Vetter J (2002). Dynamic statistical profiling of communication activity in distributed applications, ACM SIGMETRICS Performance Evaluation Review, 30:1, (240-250), Online publication date: 1-Jun-2002.
- Baude F, Caromel D, Furmento N and Sagnol D (2002). Optimizing remote method invocation with communication-computation overlap, Future Generation Computer Systems, 18:6, (769-778), Online publication date: 1-May-2002.
- Wilde T, Kohl J and Flanery R Integrating CUMULVS into AVS/Express Proceedings of the International Conference on Computational Science-Part II, (864-873)
- Engelmann C, Scott S and Geist G Distributed Peer-to-Peer Control in Harness Proceedings of the International Conference on Computational Science-Part II, (720-728)
- Taufer M, Perathoner E, Cavalli A, Caflisch A and Stricker T Performance Characterization of a Molecular Dynamics Code on PC Clusters Proceedings of the 16th International Parallel and Distributed Processing Symposium
- Gascard E and Pierre L Mechanical Verification of Hypercube Algorithms Proceedings of the 16th International Parallel and Distributed Processing Symposium
- Vetter J and Mueller F Communication Characteristics of Large-Scale Scientific Applications for Contemporary Cluster Architectures Proceedings of the 16th International Parallel and Distributed Processing Symposium
- Swann C (2002). Maximum Likelihood Estimation Using Parallel Computing, Computational Economics, 19:2, (145-178), Online publication date: 1-Apr-2002.
- Hadjidoukas P, Polychronopoulos E and Papatheodorou T Integrating MPI and nanothreads programming model Proceedings of the 10th Euromicro conference on Parallel, distributed and network-based processing, (309-316)
- van der Steen A and Dongarra J Overview of high performance computers Handbook of massive data sets, (791-852)
- Luong P, Breshears C and Ly L Coastal ocean modeling of the U.S. west coast with multiblock grid and dual-level parallelism Proceedings of the 2001 ACM/IEEE conference on Supercomputing, (9-9)
- Jipping M and Lewandowski G Parallel processing over mobile ad hoc networks of handheld machines Proceedings of the 2nd ACM international symposium on Mobile ad hoc networking & computing, (267-270)
- Beaumont O, Boudet V, Rastello F and Robert Y (2001). Matrix Multiplication on Heterogeneous Platforms, IEEE Transactions on Parallel and Distributed Systems, 12:10, (1033-1051), Online publication date: 1-Oct-2001.
- Kesavan R and Panda D (2001). Efficient Multicast on Irregular Switch-Based Cut-Through Networks with Up-Down Routing, IEEE Transactions on Parallel and Distributed Systems, 12:8, (808-828), Online publication date: 1-Aug-2001.
- Vetter J and McCracken M (2001). Statistical scalability analysis of communication operations in distributed applications, ACM SIGPLAN Notices, 36:7, (123-132), Online publication date: 1-Jul-2001.
- Vetter J and McCracken M Statistical scalability analysis of communication operations in distributed applications Proceedings of the eighth ACM SIGPLAN symposium on Principles and practices of parallel programming, (123-132)
- Kim J, Kim K and Jung S Building a high-performance communication layer over virtual interface architecture on Linux clusters Proceedings of the 15th international conference on Supercomputing, (335-347)
- Demaine E, Foster I, Kesselman C and Snir M (2001). Generalized Communicators in the Message Passing Interface, IEEE Transactions on Parallel and Distributed Systems, 12:6, (610-616), Online publication date: 1-Jun-2001.
- Gondzio J, Sarkissian R and Vial J (2001). Parallel Implementation of a Central Decomposition Method for Solving Large-Scale Planning Problems, Computational Optimization and Applications, 19:1, (5-29), Online publication date: 1-Apr-2001.
- Bohossian V, Fan C, LeMahieu P, Riedel M, Bruck J and Xu L (2001). Computing in the RAIN, IEEE Transactions on Parallel and Distributed Systems, 12:2, (99-114), Online publication date: 1-Feb-2001.
- Baker C, Watson L, Grossman B, Mason W and Haftka R Parallel global aircraft configuration design space exploration Practical parallel computing, (79-96)
- Smith J, Watson P, Sampaio S and Paton N Polar Proceedings of the ninth international conference on Information and knowledge management, (352-359)
- Vetter J and de Supinski B Dynamic software testing of MPI applications with umpire Proceedings of the 2000 ACM/IEEE conference on Supercomputing, (51-es)
- Vadhiyar S, Fagg G and Dongarra J Automatically tuned collective communications Proceedings of the 2000 ACM/IEEE conference on Supercomputing, (3-es)
- Cramer C and Board J The development and integration of a distributed 3D FFT for a cluster of workstations Proceedings of the 4th annual Linux Showcase & Conference - Volume 4, (26-26)
- Nikolopoulos D, Papatheodorou T, Polychronopoulos C, Labarta J and Ayguadé E (2000). A transparent runtime data distribution engine for OpenMP, Scientific Programming, 8:3, (143-162), Online publication date: 1-Aug-2000.
- Tang H, Shen K and Yang T (2000). Program transformation and runtime support for threaded MPI execution on shared-memory machines, ACM Transactions on Programming Languages and Systems, 22:4, (673-700), Online publication date: 1-Jul-2000.
- Vetter J Performance analysis of distributed applications using automatic classification of communication inefficiencies Proceedings of the 14th international conference on Supercomputing, (245-254)
- Bougé L, Méhaut J, Namyst R and Prylli L Using the VI architecture to build distributed, multithreaded runtime systems Proceedings of the 2000 ACM symposium on Applied computing - Volume 2, (704-709)
- Rudra A and Gopalan R Adaptive use of a cluster of PCs for data warehousing applications Proceedings of the 2000 ACM symposium on Applied computing - Volume 2, (698-703)
- Rao S, Alvisi L and Vin H (2000). The Cost of Recovery in Message Logging Protocols, IEEE Transactions on Knowledge and Data Engineering, 12:2, (160-173), Online publication date: 1-Mar-2000.
- Gray P and Sunderam V (1999). Metacomputing with the ICET System, International Journal of High Performance Computing Applications, 13:3, (241-252), Online publication date: 1-Aug-1999.
- Tang H, Shen K and Yang T (1999). Compile/run-time support for threaded MPI execution on multiprogrammed shared memory machines, ACM SIGPLAN Notices, 34:8, (107-118), Online publication date: 1-Aug-1999.
- Baptist L and Cormen T Multidimensional, multiprocessor, out-of-core FFTs with distributed memory and parallel disks (extended abstract) Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures, (242-250)
- Tang H, Shen K and Yang T Compile/run-time support for threaded MPI execution on multiprogrammed shared memory machines Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming, (107-118)
- Banikazemi M, Sampathkumar J, Prabhu S, Panda D and Sadayappan P Communication Modeling of Heterogeneous Networks of Workstations for Performance Characterization of Collective Operations Proceedings of the Eighth Heterogeneous Computing Workshop
- Dubois-Pelerin Y, Van Kemenade V and Deville M (1999). An Object-Oriented Toolbox for Spectral Element Analysis, Journal of Scientific Computing, 14:1, (1-29), Online publication date: 1-Mar-1999.
- Shen K, Tang H and Yang T Adaptive two-level thread management for fast MPI execution on shared memory machines Proceedings of the 1999 ACM/IEEE conference on Supercomputing, (49-es)
- Omori Y, Fukuda A and Joe K (1999). An Object-Oriented Framework for Loop Parallelization, The Journal of Supercomputing, 13:1, (57-69), Online publication date: 1-Jan-1999.
- Saltz J, Sussman A, Graham S, Demmel J, Baden S and Dongarra J (1998). Programming tools and environments, Communications of the ACM, 41:11, (64-73), Online publication date: 1-Nov-1998.
- Culler D, Singh J and Gupta A (1998). Parallel Computer Architecture, 10.5555/2821564, Online publication date: 29-Sep-1998.
- Osawa N An enhanced 3-D animation tool for performance tuning of parallel programs based on dynamic models Proceedings of the SIGMETRICS symposium on Parallel and distributed tools, (72-80)
- Kohl J and Papadopoulas P Efficient and flexible fault tolerance and migration of scientific simulations using CUMULVS Proceedings of the SIGMETRICS symposium on Parallel and distributed tools, (60-71)
- Tatebe O, Kodama Y, Sekiguchi S and Yamaguchi Y Highly efficient implementation of MPI point-to-point communication using remote memory operations Proceedings of the 12th international conference on Supercomputing, (267-273)
- Casanova H and Dongarra J (1998). Applying NetSolve's Network-Enabled Server, IEEE Computational Science & Engineering, 5:3, (57-67), Online publication date: 1-Jul-1998.
- Hung C, YarKhan A, Wong K, von Laven S and Coleman T Parallel implementation of an integrated edge-preserving smoothing algorithm in clusters of workstations Proceedings of the 36th annual ACM Southeast Conference, (20-22)
- Moreira J and Midkiff S (1998). Fortran 90 in CSE, IEEE Computational Science & Engineering, 5:2, (39-49), Online publication date: 1-Apr-1998.
- Brunett S, Davis D, Gottschalk T, Messina P and Kesselman C Implementing Distributed Synthetic Forces Simulations in Metacomputing Environments Proceedings of the Seventh Heterogeneous Computing Workshop
- Casanova H and Dongarra J NetSolve Proceedings of the Seventh Heterogeneous Computing Workshop
- Jagannathan S and Kelsey R On the Interaction of Mobile Processes and Objects Proceedings of the Seventh Heterogeneous Computing Workshop
- Watts J and Taylor S (1998). A Practical Approach to Dynamic Load Balancing, IEEE Transactions on Parallel and Distributed Systems, 9:3, (235-248), Online publication date: 1-Mar-1998.
- Desprez F, Randriamaro C, Dongarra J, Petitet A and Robert Y (1998). Scheduling Block-Cyclic Array Redistribution, IEEE Transactions on Parallel and Distributed Systems, 9:2, (192-205), Online publication date: 1-Feb-1998.
- Cormen T, Wegmann J and Nicol D Multiprocessor out-of-core FFTs with distributed memory and parallel disks (extended abstract) Proceedings of the fifth workshop on I/O in parallel and distributed systems, (68-78)
- Alpatov P, Baker G, Edwards C, Gunnels J, Morrow G, Overfelt J, van de Geijn R and Wu Y PLAPACK Proceedings of the 1997 ACM/IEEE conference on Supercomputing, (1-16)
- Nucciarone J, Özyörük Y and Long L New life in dusty decks Proceedings of the 1997 ACM/IEEE conference on Supercomputing, (1-19)
- Fiedler R Optimization and scaling of shared-memory and message-passing implementations of the ZEUS hydrodynamics algorithm Proceedings of the 1997 ACM/IEEE conference on Supercomputing, (1-16)
- Berry M and Minser K Distributed land-cover change simulation Proceedings of the 5th ACM international workshop on Advances in geographic information systems, (67-70)
- Goddard N, Hood G, Cohen J, Eddy W, Genovese C, Noll D and Nystrom L (1997). Online Analysis of Functional MRI Datasets on Parallel Platforms, The Journal of Supercomputing, 11:3, (295-318), Online publication date: 1-Nov-1997.
- Ma K and Crockett T A scalable parallel cell-projection volume rendering algorithm for three-dimensional unstructured data Proceedings of the IEEE symposium on Parallel rendering, (95-ff.)
- Kulaczewski M and Siegel H Implementations of a Feature-Based Visual Tracking Algorithm on Two MIMD Machines Proceedings of the international Conference on Parallel Processing, (422-430)
- van Gemund A The importance of synchronization structure in parallel program optimization Proceedings of the 11th international conference on Supercomputing, (164-171)
- Kim Y, Plank J and Dongarra J Fault Tolerant Matrix Operations for Networks of Workstations Using Multiple Checkpointing Proceedings of the High-Performance Computing on the Information Superhighway, HPC-Asia '97
- Edjlali G, Sussman A and Saltz J Interoperability of Data Parallel Runtime Libraries Proceedings of the 11th International Symposium on Parallel Processing, (451-459)
- Murthy V and Krishnamurthy E Heterogeneous programming with concurrent objects Proceedings of the 1997 ACM symposium on Applied computing, (454-463)
- Silva C, Kaufman A and Pavlakos C (1996). PVR, IEEE Computational Science & Engineering, 3:4, (18-28), Online publication date: 1-Dec-1996.
- Carrillo A, Horner D, Peters J and West J Design of a large scale discrete element soil model for high performance computing systems Proceedings of the 1996 ACM/IEEE conference on Supercomputing, (51-es)
- Blackford L, Choi J, Cleary A, Petitet A, Whaley R, Demmel J, Dhillon I, Stanley K, Dongarra J, Hammarling S, Henry G and Walker D ScaLAPACK Proceedings of the 1996 ACM/IEEE conference on Supercomputing, (5-es)
- Reschke C, Sterling T, Becker D, Merkey P and Savarese D A Design Study of Alternative Network Topologies for the Beowulf Parallel Workstation Proceedings of the 5th IEEE International Symposium on High Performance Distributed Computing
- Dongarra J, Otto S, Snir M and Walker D (1996). A message passing standard for MPP and workstations, Communications of the ACM, 39:7, (84-90), Online publication date: 1-Jul-1996.
- Ludwig S Running krill herd algorithm on Hadoop: A performance study 2016 IEEE Congress on Evolutionary Computation (CEC), (2504-2510)
Recommendations
MPI + MPI: a new hybrid approach to parallel programming with MPI plus shared memory
Hybrid parallel programming with the message passing interface (MPI) for internode communication in conjunction with a shared-memory programming model to manage intranode parallelism has become a dominant approach to scalable parallel programming. While ...
MPI: past, present and future
PVM/MPI'07: Proceedings of the 14th European conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
This talk will trace the origins of MPI from the early message-passing, distributed memory, parallel computers in the 1980's, to today's parallel supercomputers. In these early days, parallel computing companies implemented proprietary message-passing ...
MPI-StarT: delivering network performance to numerical applications
SC '98: Proceedings of the 1998 ACM/IEEE conference on Supercomputing
We describe an MPI implementation for a cluster of SMPs interconnected by a high-performance interconnect. This work is a collaboration between a numerical applications programmer and a cluster interconnect architect. The collaboration started with the ...