Empirical study of latency hiding on a fine-grain parallel processor

Published: 01 August 1993
DOI: 10.1145/165939.165972

Abstract

Latencies associated with memory accesses and interprocess communication are among the most difficult obstacles to building a practical massively parallel system. Two approaches to hiding latency have been proposed so far: prefetching and multithreading. An instruction-level data-driven computer is an ideal test bed for evaluating these latency-hiding methods, because prefetching and multithreading are naturally implemented on such a machine as unfolding and the concurrent execution of multiple contexts. This paper evaluates latency-hiding methods on SIGMA-1, a dataflow supercomputer developed at the Electrotechnical Laboratory. The evaluation shows that these methods are effective at hiding static latencies but not dynamic latencies, and that concurrent execution of multiple contexts is more effective than prefetching.
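
As a rough illustration of the two techniques the paper compares, the sketch below shows conventional-CPU analogues in C. It is not the SIGMA-1 mechanism: "prefetching" is approximated with the GCC/Clang __builtin_prefetch hint issued a fixed distance ahead of use, and "concurrent execution of multiple contexts" is approximated by interleaving two independent computation streams so that useful work overlaps outstanding memory accesses. The function names, the prefetch distance DIST, and the array size N are illustrative assumptions, not values from the paper.

/* Minimal sketch of two latency-hiding analogues on a conventional CPU.
 * Build with GCC or Clang:  cc -O2 latency_hiding_sketch.c -o sketch
 */
#include <stdio.h>
#include <stdlib.h>

#define N    (1 << 20)   /* elements per array (assumed) */
#define DIST 16          /* prefetch distance in elements (assumed, machine-dependent) */

/* Prefetching analogue: request a[i + DIST] while a[i] is being summed,
 * so the memory access for a later iteration overlaps current work. */
static long long sum_with_prefetch(const long long *a, size_t n)
{
    long long s = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + DIST < n)
            __builtin_prefetch(&a[i + DIST], 0 /* read */, 1 /* low temporal locality */);
        s += a[i];
    }
    return s;
}

/* Multiple-context analogue: interleave two independent sums ("contexts")
 * in one loop, so that while one stream waits on memory the other can
 * still make progress. */
static long long sum_two_contexts(const long long *a, const long long *b, size_t n)
{
    long long sa = 0, sb = 0;
    for (size_t i = 0; i < n; i++) {
        sa += a[i];   /* context A */
        sb += b[i];   /* context B */
    }
    return sa + sb;
}

int main(void)
{
    long long *a = malloc(N * sizeof *a);
    long long *b = malloc(N * sizeof *b);
    if (!a || !b)
        return 1;
    for (size_t i = 0; i < N; i++) {
        a[i] = (long long)i;
        b[i] = (long long)(2 * i);
    }
    printf("prefetch sum:    %lld\n", sum_with_prefetch(a, N));
    printf("two-context sum: %lld\n", sum_two_contexts(a, b, N));
    free(a);
    free(b);
    return 0;
}

On SIGMA-1 itself the corresponding mechanisms are unfolding and concurrent execution of multiple contexts rather than explicit prefetch instructions; the sketch only conveys why overlapping independent work with outstanding memory accesses hides latency.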



    Published In

ICS '93: Proceedings of the 7th International Conference on Supercomputing
August 1993, 425 pages
ISBN: 089791600X
DOI: 10.1145/165939

Publisher

Association for Computing Machinery, New York, NY, United States

