Proceedings of the 16th annual international symposium on Computer architecture

ISCA '89: Proceedings of the 16th annual international symposium on Computer architecture

April 1989

1989 Proceeding

Chairman:
Jean-Claude Syre

Publisher:

Association for Computing Machinery, New York, NY, United States

Conference:

Jerusalem, Israel

ISBN:

978-0-89791-319-5

Published:

01 April 1989

Sponsors:

SIGARCH, IEEE-CS

Bibliometrics (reflects downloads up to 25 Dec 2024)

Citation Count: 1,460
Downloads (6 weeks): 739
Downloads (12 months): 5,036
Downloads (cumulative): 29,765

Article

Free

Evaluating the performance of four snooping cache coherency protocols

S. J. Eggers,
R. H. Katz

Pages 2–15, https://doi.org/10.1145/74925.74927

Write-invalidate and write-broadcast coherency protocols have been criticized for being unable to achieve good bus performance across all cache configurations. In particular, write-invalidate performance can suffer as block size increases; and large ...

Metrics
Total Citations: 65
Total Downloads: 1,206
Last 12 Months: 137
Last 6 weeks: 17

Article

Free

Multi-level shared caching techniques for scalability in VMP-M/C

D. R. Cheriton,
H. A. Goosen,
P. D. Boyle

Pages 16–24, https://doi.org/10.1145/74925.74928

The problem of building a scalable shared memory multiprocessor can be reduced to that of building a scalable memory hierarchy, assuming interprocessor communication is handled by the memory system. In this paper, we describe the VMP-MC design, a ...

Metrics
Total Citations: 47
Total Downloads: 478
Last 12 Months: 84
Last 6 weeks: 10

Article

Free

Design and performance of a coherent cache for parallel logic programming architectures

A. Goto,
A. Matsumoto,
E. Tick

Pages 25–33, https://doi.org/10.1145/74925.74929

This paper describes the design and performance of a tightly-coupled shared-memory coherent cache optimized for the execution of parallel logic programming architectures. The cache utilizes a copy-back write-allocation protocol having five states and a ...

Metrics
Total Citations: 3
Total Downloads: 511
Last 12 Months: 81
Last 6 weeks: 9

Article

Free

The Epsilon dataflow processor

V. G. Grafe,
G. S. Davidson,
J. E. Hoch,
V. P. Holmes

Pages 36–45, https://doi.org/10.1145/74925.74930

The εpsilon dataflow architecture is designed for high speed uniprocessor execution as well as for parallel operation in a multiprocessor system. The εpsilon architecture directly matches ready operands, thus eliminating the need for associative ...

Metrics
Total Citations: 44
Total Downloads: 655
Last 12 Months: 124
Last 6 weeks: 15

Article

Free

An architecture of a dataflow single chip processor

S. Sakai,
Y. Yamaguchi,
K. Hiraki,
Y. Kodama,
T. Yuba

Pages 46–53, https://doi.org/10.1145/74925.74931

A highly parallel (more than a thousand) dataflow machine EM-4 is now under development. The EM-4 design principle is to construct a high performance computer using a compact architecture by overcoming several defects of dataflow machines. Constructing ...

Metrics
Total Citations: 150
Total Downloads: 892
Last 12 Months: 149
Last 6 weeks: 30

Article

Free

Exploiting data parallelism in signal processing on a dataflow machine

P. Nitezki

Pages 54–61, https://doi.org/10.1145/74925.74932

This paper will show that the massive data parallelism inherent to most signal processing tasks may be easily mapped onto the parallel structure of a data flow machine. A special system called STRUCTFLOW has been designed to optimize the static data ...

Metrics
Total Citations: 0
Total Downloads: 428
Last 12 Months: 79
Last 6 weeks: 5

Article

Free

Architectural mechanisms to support sparse vector processing

R. N. Ibbett,
T. M. Hopkins,
K. I. M. McKinnon

Pages 64–71, https://doi.org/10.1145/74925.74933

We discuss the algorithmic steps involved in common sparse matrix problems, with particular emphasis on linear programming by the revised simplex method. We then propose new architectural mechanisms which are being built into an experimental machine, ...

Metrics
Total Citations: 6
Total Downloads: 465
Last 12 Months: 99
Last 6 weeks: 4

Article

Free

A dynamic storage scheme for conflict-free vector access

D. T. Harper,
D. A. Linebarger

Pages 72–77, https://doi.org/10.1145/74925.74934

Previous investigations into data storage schemes have focused on finding a storage scheme that permits conflict-free access for a set of frequently encountered access patterns. This paper considers an alternative approach. Rather than forcing a single ...

Metrics
Total Citations: 6
Total Downloads: 414
Last 12 Months: 77
Last 6 weeks: 10

Article

Free

SIMP (Single Instruction stream/Multiple instruction Pipelining): a novel high-speed single-processor architecture

K. Murakami,
N. Irie,
S. Tomita

Pages 78–85, https://doi.org/10.1145/74925.74935

SIMP is a novel multiple instruction-pipeline parallel architecture. It is targeted for enhancing the performance of SISD processors drastically by exploiting both temporal and spatial parallelisms, and for keeping program compatibility as well. Degree ...

Metrics
Total Citations: 64
Total Downloads: 1,175
Last 12 Months: 142
Last 6 weeks: 32

Article

Free

2-D SIMD algorithms in the perfect shuffle networks

Y. Ben-Asher,
D. Egozi,
A. Schuster

Pages 88–95, https://doi.org/10.1145/74925.74936

This paper studies a set of basic algorithms for SIMD Perfect Shuffle networks. These algorithms were studied in several papers, but for the 1-D case, where the size of the problem N is the same as the number of processors P. For the 2-D case of N = L *...

Metrics
Total Citations: 0
Total Downloads: 511
Last 12 Months: 95
Last 6 weeks: 7

Article

Free

Systematic hardware adaptation of systolic algorithms

M. Valero-Garcia,
J. J. Navarro,
J. M. Llaberia,
M. Valero

Pages 96–104, https://doi.org/10.1145/74925.74937

In this paper we propose a methodology to adapt Systolic Algorithms to the hardware selected for their implementation. Systolic Algorithms obtained can be efficiently implemented using Pipelined Functional Units. The methodology is based on two ...

Metrics
Total Citations: 4
Total Downloads: 504
Last 12 Months: 76
Last 6 weeks: 12

Article

Free

Task migration in hypercube multiprocessors

M.-S. Chen,
K. G. Shin

Pages 105–111, https://doi.org/10.1145/74925.74938

Allocation and deallocation of subcubes usually result in a fragmented hypercube where even if a sufficient number of hypercube nodes are available, they do not form a subcube large enough to execute an incoming task. As the fragmentation in ...

Metrics
Total Citations: 10
Total Downloads: 586
Last 12 Months: 66
Last 6 weeks: 6

Article

Free

Characteristics of performance-optimal multi-level cache hierarchies

S. Przybylski,
M. Horowitz,
J. Hennessy

Pages 114–121, https://doi.org/10.1145/74925.74939

The increasing speed of new generation processors will exacerbate the already large difference between CPU cycle times and main memory access times. As this difference grows, it will be increasingly difficult to build single-level caches that are both ...

Metrics
Total Citations: 58
Total Downloads: 1,955
Last 12 Months: 318
Last 6 weeks: 21

Article

Free

Supporting reference and dirty bits in SPUR's virtual address cache

D. A. Wood,
R. H. Katz

Pages 122–130, https://doi.org/10.1145/74925.74940

Virtual address caches can provide faster access times than physical address caches, because translation is only required on cache misses. However, because we don't check the translation information on each cache access, maintaining reference and dirty ...

Metrics
Total Citations: 7
Total Downloads: 754
Last 12 Months: 224
Last 6 weeks: 22

Article

Free

Inexpensive implementations of set-associativity

R. E. Kessler,
R. Jooss,
A. Lebeck,
M. D. Hill

Pages 131–139, https://doi.org/10.1145/74925.74941

The traditional approach to implementing wide set-associativity is expensive, requiring a wide tag memory (directory) and many comparators. Here we examine alternative implementations of associativity that use hardware similar to that used to implement ...

Metrics
Total Citations: 70
Total Downloads: 726
Last 12 Months: 125
Last 6 weeks: 19

Article

Free

Organization and performance of a two-level virtual-real cache hierarchy

W. H. Wang,
J.-L. Baer,
H. M. Levy

Pages 140–148, https://doi.org/10.1145/74925.74942

We propose and analyze a two-level cache organization that provides high memory bandwidth. The first-level cache is accessed directly by virtual addresses. It is small, fast, and, without the burden of address translation, can easily be optimized to ...

Metrics
Total Citations: 65
Total Downloads: 1,601
Last 12 Months: 137
Last 6 weeks: 23

Article

Free

High performance communications in processor networks

C. R. Jesshope,
P. R. Miller,
J. T. Yantchev

Pages 150–157, https://doi.org/10.1145/74925.74943

In order to provide an arbitrary and fully dynamic connectivity in a network of processors, transport mechanisms must be implemented, which provide the propagation of data from processor to processor, based on addresses contained within a packet of ...

Metrics
Total Citations: 43
Total Downloads: 601
Last 12 Months: 132
Last 6 weeks: 20

Article

Free

Introducing memory into the switch elements of multiprocessor interconnection networks

H. E. Mizrahi,
J.-L. Baer,
E. D. Lazowska,
J. Zahorjan

Pages 158–166, https://doi.org/10.1145/74925.74944

As VLSI technology continues to improve, circuit area is gradually being replaced by pin restrictions as the limiting factor in design. Thus, it is reasonable to anticipate that on-chip memory will become increasingly inexpensive since it is a simple, ...

Metrics
Total Citations: 23
Total Downloads: 422
Last 12 Months: 79
Last 6 weeks: 12

Article

Free

Using feedback to control tree saturation in multistage interconnection networks

S. L. Scott,
G. S. Sohi

Pages 167–176, https://doi.org/10.1145/74925.74945

In this paper, we propose the use of feedback schemes in multiprocessors which use an interconnection network with distributed routing control. We show that by altering system behavior so as to minimize the occurrence of a performance-degrading ...

Metrics
Total Citations: 19
Total Downloads: 385
Last 12 Months: 55
Last 6 weeks: 7

Article

Free

Constructing replicated systems using processors with point-to-point communication links

P. D. Ezhilchelvan,
S. K. Shrivastava,
A. Tully

Pages 177–184, https://doi.org/10.1145/74925.74946

Replicated processing with majority voting is a well known method of achieving fault tolerance. We consider the problem of constructing a distributed system composed of an arbitrarily large number of N-modular redundant (NMR) nodes, where each node ...

Metrics
Total Citations: 2
Total Downloads: 378
Last 12 Months: 41
Last 6 weeks: 4

Article

Free

KCM: a knowledge crunching machine

H. Benker,
J. M. Beacco,
M. Dorochevsky,
Th. Jeffré,
A. Pöhlmann,
J. Noyé,
B. Poterie,
J. C. Syre,
O. Thibault,
G. Watzlawik

Pages 186–194, https://doi.org/10.1145/74925.74947

KCM (Knowledge Crunching Machine) is a high-performance back-end processor which, coupled to a UNIX desk-top workstation, provides a powerful and user-friendly Prolog environment catering for both development and execution of significant Prolog ...

Metrics
Total Citations: 8
Total Downloads: 665
Last 12 Months: 218
Last 6 weeks: 54

Article

Free

A high performance Prolog processor with multiple function units

A. Singhal,
Y. N. Patt

Pages 195–202, https://doi.org/10.1145/74925.74948

We describe the Parallel Unification Machine (PLUM), a Prolog processor that exploits fine grain parallelism using multiple function units executing in parallel. In most cases the execution of bookkeeping instructions is almost completely overlapped by ...

Metrics
Total Citations: 2
Total Downloads: 460
Last 12 Months: 94
Last 6 weeks: 22

Article

Free

Evaluation of memory system for integrated Prolog processor IPP

M. Morioka,
S. Yamaguchi,
T. Bandoh

Pages 203–210, https://doi.org/10.1145/74925.74949

This paper discusses an optimal memory system to realize a high performance integrated Prolog processor, the IPP. First, the memory access characteristics of Prolog are analyzed by a simulator, which simulates the execution of a Prolog program at a ...

Metrics
Total Citations: 1
Total Downloads: 326
Last 12 Months: 57
Last 6 weeks: 7

Article

Free

A type driven hardware engine for Prolog clause retrieval over a large knowledge base

K.-F. Wong,
M. H. Williams

Pages 211–222, https://doi.org/10.1145/74925.74950

Whereas existing Prolog systems are very effective at handling small knowledge bases, they are not very efficient at and often incapable of handling large sets of clauses. Large knowledge bases which may comprise millions of clauses and are shared by a ...

Metrics
Total Citations: 3
Total Downloads: 400
Last 12 Months: 102
Last 6 weeks: 16

Article

Free

Comparing software and hardware schemes for reducing the cost of branches

W. W. Hwu,
T. M. Conte,
P. P. Chang

Pages 224–233, https://doi.org/10.1145/74925.74951

Pipelining has become a common technique to increase throughput of the instruction fetch, instruction decode, and instruction execution portions of modern computers. Branch instructions disrupt the flow of instructions through the pipeline, increasing ...

Metrics
Total Citations: 71
Total Downloads: 639
Last 12 Months: 135
Last 6 weeks: 21

Article

Free

Improving performance of small on-chip instruction caches

M. K. Farrens,
A. R. Pleszkun

Pages 234–241, https://doi.org/10.1145/74925.74952

Most current single-chip processors employ an on-chip instruction cache to improve performance. A miss in this instruction cache will cause an external memory reference which must compete with data references for access to the external memory, thus ...

Metrics
Total Citations: 22
Total Downloads: 499
Last 12 Months: 94
Last 6 weeks: 8

Article

Free

Achieving high instruction cache performance with an optimizing compiler

W. W. Hwu,
P. P. Chang

Pages 242–251, https://doi.org/10.1145/74925.74953

Increasing the execution power requires a high instruction issue bandwidth, and decreasing instruction encoding and applying some code improving techniques cause code expansion. Therefore, the instruction memory hierarchy performance has become an ...

Metrics
Total Citations: 183
Total Downloads: 1,168
Last 12 Months: 138
Last 6 weeks: 20

Article

Free

The impact of code density on instruction cache performance

P. Steenkiste

Pages 252–259, https://doi.org/10.1145/74925.74954

The widespread use of reduced-instruction-set computers has generated a lot of interest in the tradeoff between the density of an instruction set and the size of the instruction cache. In this paper we present and justify a method that predicts the ...

Metrics
Total Citations: 23
Total Downloads: 579
Last 12 Months: 110
Last 6 weeks: 12

Article

Free

Can dataflow subsume von Neumann computing?

R. S. Nikhil

Pages 262–272, https://doi.org/10.1145/74925.74955

We explore the question: “What can a von Neumann processor borrow from dataflow to make it more suitable for a multiprocessor?” Starting with a simple, “RISC-like” instruction set, we show how to change the underlying processor organization to make it ...

Metrics
Total Citations: 115
Total Downloads: 1,345
Last 12 Months: 202
Last 6 weeks: 33

Article

Free

Exploring the benefits of multiple hardware contexts in a multiprocessor architecture: preliminary results

W.-D. Weber,
A. Gupta

Pages 273–280, https://doi.org/10.1145/74925.74956

A fundamental problem that any scalable multiprocessor must address is the ability to tolerate high latency memory operations. This paper explores the extent to which multiple hardware contexts per processor can help to mitigate the negative effects of ...

Metrics
Total Citations: 112
Total Downloads: 1,005
Last 12 Months: 214
Last 6 weeks: 52

Contributors

Jean-Claude Syre
- Publication Years: 1978 - 1989
- Publication counts: 4
- Citation count: 28
- Available for Download: 3
- Downloads (cumulative): 1,553
- Downloads (12 months): 486
- Downloads (6 weeks): 116
- Average Downloads per Article: 518
- Average Citation per Article: 7

Index Terms

Proceedings of the 16th annual international symposium on Computer architecture
1. Computer systems organization
  1. Architectures
    1. Parallel architectures

Acceptance Rates

Overall Acceptance Rate 543 of 3,203 submissions, 17%

Year	Submitted	Accepted	Rate
ISCA '22	400	67	17%
ISCA '19	365	62	17%
ISCA '17	322	54	17%
ISCA '13	288	56	19%
ISCA '12	262	47	18%
ISCA '08	259	37	14%
ISCA '06	234	31	13%
ISCA '05	194	45	23%
ISCA '04	217	31	14%
ISCA '03	184	36	20%
ISCA '02	180	27	15%
ISCA '01	163	24	15%
ISCA '99	135	26	19%
Overall	3,203	543	17%
