Supporting systolic and memory communication in iWarp

Authors:

Shekhar Borkar,

Craig Peterson,

Jon Webb

ISCA '90: Proceedings of the 17th annual international symposium on Computer Architecture

Pages 70 - 81

https://doi.org/10.1145/325164.325116

Published: 01 May 1990

Abstract

iWarp is a parallel architecture developed jointly by Carnegie Mellon University and Intel Corporation. The iWarp communication system supports two widely used interprocessor communication styles: memory communication and systolic communication. This paper describes the rationale, architecture, and implementation for the iWarp communication system.

The sending or receiving processor of a message can perform either memory or systolic communication. In memory communication, the entire message is buffered in the local memory of the processor before it is transmitted or after it is received. Communication therefore begins or terminates at the local memory. In conventional message-passing methods, both the sending and the receiving processor use memory communication. In systolic communication, individual data items are transferred as they are produced, or are used as they are received, by the program running on the processor. Memory communication is flexible and well suited to general computing, whereas systolic communication is efficient and well suited to speed-critical applications.
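The contrast between the two receiving styles can be illustrated with a minimal Python sketch. It is a software analogy only, not iWarp code: the function names `produce`, `receive_memory`, and `receive_systolic` are hypothetical, and a Python generator stands in for the link between processors.

```python
from typing import Iterable, Iterator, List


def produce(n: int) -> Iterator[int]:
    """Hypothetical sender: emits data items one at a time."""
    for i in range(n):
        yield i * i


def receive_memory(link: Iterable[int]) -> int:
    """Memory communication: buffer the entire message in local memory,
    then let the program consume it from that buffer."""
    buffer: List[int] = list(link)  # the whole message is stored first
    return sum(buffer)


def receive_systolic(link: Iterable[int]) -> int:
    """Systolic communication: the program uses each item as it arrives,
    without first storing the message in local memory."""
    total = 0
    for item in link:  # only the current item is held at any time
        total += item
    return total


if __name__ == "__main__":
    print(receive_memory(produce(8)))
    print(receive_systolic(produce(8)))
```

Both receivers compute the same result. The difference is where the data lives while it is being consumed: the memory version needs space for the full message, while the systolic version never holds more than one item.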

A major achievement of the iWarp effort is the derivation of a common design to satisfy the requirements of both systolic and memory communication styles. This is made possible by two important innovations in communication: (1) program access to communication and (2) logical channels. The former allows programs to access data as they are transmitted and to redirect portions of messages to different destinations efficiently. The latter increases the connectivity between the processors and guarantees communication bandwidth for classes of messages. These innovations have provided a focus for the iWarp architecture. The result is a communication system that provides a total bandwidth of 320 MBytes/sec and that is integrated on a single VLSI component with a 20 MFLOPS plus 20 MIPS long instruction word computation engine.
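The idea of several logical channels sharing one physical connection can be sketched as follows. This is an illustration under stated assumptions, not the iWarp mechanism: the abstract does not describe how channels are scheduled, so the round-robin policy is assumed here, and `multiplex` and `demultiplex` are hypothetical names.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


def multiplex(channels: Dict[str, List[int]]) -> List[Tuple[str, int]]:
    """Interleave words from several logical channels onto one physical link.

    Round-robin is an assumed policy for illustration: every channel with
    pending data gets one slot per round, so no channel is starved.
    """
    queues: Dict[str, Deque[int]] = {
        name: deque(words) for name, words in channels.items()
    }
    link: List[Tuple[str, int]] = []
    while any(queues.values()):  # an empty deque is falsy
        for name, queue in queues.items():
            if queue:
                link.append((name, queue.popleft()))
    return link


def demultiplex(link: List[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Recover each logical channel's word stream at the receiving end."""
    streams: Dict[str, List[int]] = {}
    for name, word in link:
        streams.setdefault(name, []).append(word)
    return streams


if __name__ == "__main__":
    traffic = {"a": [1, 2, 3], "b": [10]}
    on_link = multiplex(traffic)
    print(on_link)
    print(demultiplex(on_link))
```

Each word travels with its channel name, so the receiver can separate the streams again. A short message on channel `b` is not delayed behind the longer message on channel `a`.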

Index Terms

Supporting systolic and memory communication in iWarp
1. Hardware
  1. Hardware validation

Published In

ISCA '90: Proceedings of the 17th annual international symposium on Computer Architecture

May 1990

378 pages

ISBN:0897913663

DOI:10.1145/325164

ACM SIGARCH Computer Architecture News Volume 18, Issue 2SI
Special Issue: Proceedings of the 17th annual international symposium on Computer Architecture
June 1990
356 pages
ISSN:0163-5964
DOI:10.1145/325096
Chairmen:
Jean-Loup Baer,
Larry Snyder,
James Goodman

Copyright © 1990 Authors.

Sponsors

SIGARCH: ACM Special Interest Group on Computer Architecture
IEEE-CS: Computer Society

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

ISCA90

Sponsor:

SIGARCH
IEEE-CS

ISCA90: International Symposium on Computer Architecture

May 28 - 31, 1990

Seattle, Washington, USA

Acceptance Rates

Overall Acceptance Rate 543 of 3,203 submissions, 17%

Affiliations

Shekhar Borkar

School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania and Intel Corporation, CO4-01, 5200 N.E. Elam Young Pkwy, Hillsboro, Oregon

Robert Cohn

School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania and Intel Corporation, CO4-01, 5200 N.E. Elam Young Pkwy, Hillsboro, Oregon

George Cox

School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania and Intel Corporation, CO4-01, 5200 N.E. Elam Young Pkwy, Hillsboro, Oregon

Thomas Gross

School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania and Intel Corporation, CO4-01, 5200 N.E. Elam Young Pkwy, Hillsboro, Oregon

H. T. Kung

School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania and Intel Corporation, CO4-01, 5200 N.E. Elam Young Pkwy, Hillsboro, Oregon

Monica Lam

School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania and Intel Corporation, CO4-01, 5200 N.E. Elam Young Pkwy, Hillsboro, Oregon

Margie Levine

School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania and Intel Corporation, CO4-01, 5200 N.E. Elam Young Pkwy, Hillsboro, Oregon

Brian Moore

School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania and Intel Corporation, CO4-01, 5200 N.E. Elam Young Pkwy, Hillsboro, Oregon

Wire Moore

Craig Peterson

School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania and Intel Corporation, CO4-01, 5200 N.E. Elam Young Pkwy, Hillsboro, Oregon

Jim Susman

School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania and Intel Corporation, CO4-01, 5200 N.E. Elam Young Pkwy, Hillsboro, Oregon

Jim Sutton

School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania and Intel Corporation, CO4-01, 5200 N.E. Elam Young Pkwy, Hillsboro, Oregon

John Urbanski

School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania and Intel Corporation, CO4-01, 5200 N.E. Elam Young Pkwy, Hillsboro, Oregon

Jon Webb

School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania and Intel Corporation, CO4-01, 5200 N.E. Elam Young Pkwy, Hillsboro, Oregon
