Memory-system design considerations for dynamically-scheduled processors

Published: 01 May 1997

Abstract

In this paper, we identify performance trends and design relationships between the following components of the data memory hierarchy in a dynamically-scheduled processor: the register file, the lockup-free data cache, the stream buffers, and the interface between these components and the lower levels of the memory hierarchy. Similar performance was obtained from all systems having support for fewer than four in-flight misses, irrespective of the register-file size, the issue width of the processor, and the memory bandwidth. While providing support for more than four in-flight misses did increase system performance, the improvement was less than that obtained by increasing the number of registers. The addition of stream buffers to the investigated systems led to a significant performance increase, with the larger increases for systems having less in-flight-miss support, greater memory bandwidth, or more instruction issue capability. The performance of these systems was not significantly affected by the inclusion of traffic filters, dynamic-stride calculators, or the inclusion of the per-load non-unity stride-predictor and the incremental-prefetching techniques, which we introduce. However, the incremental prefetching technique reduces the bandwidth consumed by stream buffers by 50% without a significant impact on performance.

Published In

ACM SIGARCH Computer Architecture News, Volume 25, Issue 2
Special Issue: Proceedings of the 24th annual international symposium on Computer architecture (ISCA '97)
May 1997
349 pages
ISSN:0163-5964
DOI:10.1145/384286
  • ISCA '97: Proceedings of the 24th annual international symposium on Computer architecture
    June 1997
    350 pages
    ISBN:0897919017
    DOI:10.1145/264107

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 May 1997
Published in SIGARCH Volume 25, Issue 2
