research-article

Scalable coherent interface

Authors:

David V. James,

Anthony T. Laundrie,

Stein Gjessing,

Gurindar Sohi

Computer, Volume 23, Issue 6

Pages 74 - 77

https://doi.org/10.1109/2.55503

Published: 01 June 1990

Abstract

The scalable coherent interface (SCI), a local or extended computer backplane interface being defined by an IEEE standard project (P1596), is discussed. The interconnection is scalable, meaning that up to 64K processor, memory, or I/O nodes can effectively interface to a shared SCI interconnection. The SCI sharing-list structures are described, and sharing-list addition and removal are examined. Optimizations being considered to improve the performance of large system configurations are discussed. Request combining, a useful feature of linked-list coherence, is described. SCI's optional extensions, including synchronization using a queued-on-lock bit, are considered.
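The sharing-list addition and removal mentioned in the abstract can be pictured with a small sketch. This is not the P1596 protocol itself: it assumes a doubly linked list per memory line, with memory holding only the head pointer and new sharers attaching at the head. The names `SharingList`, `CacheEntry`, `add`, `remove`, and `sharers` are hypothetical and chosen for illustration only. A real implementation exchanges request and response packets between nodes and must handle concurrent updates, which this single-threaded model ignores.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CacheEntry:
    """One node's cached copy of a line, with links to its list neighbors."""
    forward: Optional[int] = None   # next sharer, toward the tail
    backward: Optional[int] = None  # previous sharer, toward the head


@dataclass
class SharingList:
    """Illustrative sharing list for a single memory line.

    Assumption: memory stores only the head pointer; the remaining
    links live in the caches of the sharing nodes.
    """
    head: Optional[int] = None
    caches: Dict[int, CacheEntry] = field(default_factory=dict)

    def add(self, node_id: int) -> None:
        """Attach a new sharer at the head of the list."""
        if node_id in self.caches:
            return  # already a sharer, nothing to do
        old_head = self.head
        self.caches[node_id] = CacheEntry(forward=old_head)
        if old_head is not None:
            self.caches[old_head].backward = node_id
        self.head = node_id

    def remove(self, node_id: int) -> None:
        """Unlink a sharer by patching its neighbors' pointers."""
        entry = self.caches.pop(node_id)
        if entry.backward is None:
            self.head = entry.forward  # removed node was the head
        else:
            self.caches[entry.backward].forward = entry.forward
        if entry.forward is not None:
            self.caches[entry.forward].backward = entry.backward

    def sharers(self) -> List[int]:
        """Walk the list from head to tail and return the node ids."""
        out: List[int] = []
        cur = self.head
        while cur is not None:
            out.append(cur)
            cur = self.caches[cur].forward
        return out


if __name__ == "__main__":
    line = SharingList()
    for node in (3, 7, 12):
        line.add(node)
    print(line.sharers())
    line.remove(7)
    print(line.sharers())
```

In this model the storage kept at memory is constant per line (one head pointer) and each sharer stores two links, so directory storage grows with the number of caches rather than with the total node count. That property is the usual motivation for linked-list coherence at large scale.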

Reviews

Reviewer: Ned Chapin

Thakkar, Dubois, Laundrie, and Sohi

The authors briefly survey shared-memory multiprocessor hardware architectures, emphasizing the current main directions of research. They do not discuss distributed multiprocessor architectures such as NCube or iPSC. For shared-memory architectures, the authors mention network switching-based architectures (such as BBN's Butterfly) and bus-based architectures (such as Sequent's Symmetry). They say very little about network switching-based architectures, however, and instead focus on directory-based and bus-based schemes for maintaining coherence (providing for the integrity of data shared among the processors during computation). After quickly reviewing four coherence properties that are incorporated in most protocols for maintaining coherence in shared-memory architectures, the authors summarize the use of presence flags, B pointers, and linked lists as bases for alternative protocols. As an example of the linked list approach, they mention the IEEE Scalable Coherent Interface project.

Protocols for maintaining coherence become more complex and voluminous as the number of processors (and their associated ports and cache memories) increases. The reason for being concerned with coherence is to attempt to avoid a more-than-proportional increase in the complexity and volume of the protocol as the number of processors increases (the “scale” relationship). Another approach to seeking a favorable scale relationship is to modify the hardware for the processor connections. In this area, the authors review bus-based schemes, emphasizing multiple-bus and hierarchical-bus systems. They briefly mention various proposals for differing topologies and roles for the processor connections for enabling access to shared memory.

While the authors profess to have had a lot of help in preparing this survey, the result is not well-balanced. The purpose was to provide a context for three subsequent short papers, one on an example of a bus-based scheme (the Aquarius multiple-bus multiprocessor architecture) and two on the linked-list variety of directory-based schemes (the SCI at the Universities of Oslo and Wisconsin and the SDD protocol at Stanford University). The context could have been better set by leaving fewer loose ends, by being more consistent in the use of terminology, and by being more direct about the complexity supposedly being mitigated. From the terminology and the references, the multiprocessor hardware and protocol people are clearly not talking with the software database people. While significant parallels exist in the situations and problems they face, as well as in the general character of the resources they can marshal, each group seems to be trying to proceed as though the other had little to offer. I see very capable people in both groups, but they are not in touch with each other.

James, Laundrie, Gjessing, and Sohi

The aim of the Scalable Coherent Interface (SCI), IEEE standards project P1596, is to define an extended computer backplane enabling access to a shared memory, scalable up to 64K nodes with a transfer rate of one gigabyte per second per node. Nodes may be processors, memories, or input-output ports in any mix. The approach taken thus far is to use a distributed directory; linked lists; cache memory; point-to-point unidirectional connections for the communication of packets; and techniques emphasizing reliability, fault recovery, and optimization for high-frequency transactions. The definition work is being done by simulation, with participation by a group at the University of Oslo and a group at the University of Wisconsin. The SCI-P1596 chair is David B. Gustavson of Stanford University. The bulk of the paper discusses some of the list handling done for common anticipated situations. The discussion of how the proposed list handling differs from the usual bidirectionally linked list handling for queues and stacks seems weak. The bibliography is disappointingly skimpy.

Thapar and Delagi

The authors report on their work on a distributed-directory scheme for shared-memory multiprocessors. Singly-linked lists are the fundamental data structures used to help provide coherence in the access to shared data. The authors use most of the paper to describe basic list operations performed on the distributed queues; they also contrast their work with the Wisconsin-Oslo SCI work. While this work appears to be more complex than the SCI work, the authors also apparently assume fewer restrictions on the hardware configuration. While they offer some words of contrast, I would have liked to read how they see their list operations as differing from the usual and what they see as the tradeoffs on coherence for their proposed protocols.

Carlton and Despain

In the Aquarius scalable multiple-bus shared-memory hardware architecture, each node has access to two or more buses arranged in a multidimensional array and serving as a network. Access to the network is provided only for nodes; each node has memory, a cache, and a processor. Part of the memory is used for a portion of a distributed directory to provide coherence in processing shared data. Shared data are held in cache, except at the “root node” for the data. The root node can have private (unshared) data. Nodes can share data most quickly when they are on the same bus. Cache states and directory states are distributed, with each node showing the states only for the data it has. The authors give a clear summary of their proposed “multi-multi” architecture and protocol, including a few rough quantitative measures of scalability. They give no feel for the tradeoffs and compromises, and the list of references is helpful only on history. The authors only briefly discuss how they visualize the protocol working for widely shared data, a point I would have liked to read more about.

Information

Published In

Computer Volume 23, Issue 6

June 1990

118 pages

ISSN: 0018-9162

Editor: Bruce D. Shriver, Univ. of Southwestern Louisiana, Lafayette

Publisher

IEEE Computer Society Press

Washington, DC, United States
