Integrating message-passing and shared-memory: early experience

Authors:

John Kubiatowicz,

Beng-Hong Lim

PPOPP '93: Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming

Pages 54 - 63

https://doi.org/10.1145/155332.155338

Published: 01 July 1993

Abstract

This paper discusses some of the issues involved in implementing a shared-address-space programming model on large-scale, distributed-memory multiprocessors. While such a programming model can be implemented on both shared-memory and message-passing architectures, we argue that the transparent, coherent caching of global data provided by many shared-memory architectures is of crucial importance. Because message-passing mechanisms are much more efficient than shared-memory loads and stores for certain types of interprocessor communication and synchronization operations, however, we argue for building multiprocessors that efficiently support both shared-memory and message-passing mechanisms. We describe an architecture, Alewife, that integrates support for shared-memory and message-passing through a simple interface; we expect the compiler and runtime system to cooperate in using appropriate hardware mechanisms that are most efficient for specific operations. We report on both integrated and exclusively shared-memory implementations of our runtime system and two applications. The integrated runtime system drastically cuts down the cost of communication incurred by the scheduling, load balancing, and certain synchronization operations. We also present preliminary performance results comparing the two systems.
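The contrast the abstract draws between the two mechanisms can be illustrated with a minimal sketch. It is not Alewife's interface: it uses ordinary Python threads on one machine, and the names `handoff_shared_memory` and `handoff_message` are hypothetical. In the first function the consumer repeatedly reads a shared flag until the producer's store becomes visible, which stands in for synchronization through shared-memory loads and stores. In the second, data and notification travel together in a single message.

```python
import queue
import threading


def handoff_shared_memory(value):
    """Hand one value to a consumer through a shared slot guarded by a flag."""
    slot = {"ready": False, "data": None}   # shared state (hypothetical layout)
    lock = threading.Lock()
    received = []

    def consumer():
        # Poll the flag; every iteration stands in for a read of shared data.
        while True:
            with lock:
                if slot["ready"]:
                    received.append(slot["data"])
                    return

    worker = threading.Thread(target=consumer)
    worker.start()
    with lock:
        slot["data"] = value    # store the payload
        slot["ready"] = True    # then publish it by setting the flag
    worker.join()
    return received[0]


def handoff_message(value):
    """Hand one value to a consumer as a single message."""
    inbox = queue.Queue()
    received = []

    def consumer():
        # Block until a message arrives; payload and notification are one event.
        received.append(inbox.get())

    worker = threading.Thread(target=consumer)
    worker.start()
    inbox.put(value)
    worker.join()
    return received[0]
```

Both functions deliver the same value. The sketch only shows the shape of the two styles and says nothing about their relative cost on real hardware, which is what the paper measures.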



Published In

PPOPP '93: Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming

August 1993

259 pages

ISBN:0897915895

DOI:10.1145/155332

Chairmen: Marina Chen (Yale Univ., New Haven, CT), Robert Halstead (DEC Cambridge Research Lab.)

ACM SIGPLAN Notices Volume 28, Issue 7
July 1993
259 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/173284
Editor: Richard Wexelblat (IDA/CSED, Alexandria, VA)

Copyright © 1993 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Article

Conference

PPOPP93

Sponsor:

SIGPLAN

PPOPP93: Principles & Practices of Parallel Programming

May 19 - 22, 1993

California, San Diego, USA

Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%

Bibliometrics

Total Citations: 128
Total Downloads: 847
Downloads (Last 12 months): 120
Downloads (Last 6 weeks): 30

Reflects downloads up to 13 Sep 2024

