DOI: 10.5555/320080.320085

Fetch directed instruction prefetching

Published: 16 November 1999

Abstract

Instruction supply is a crucial component of processor performance. Instruction prefetching has been proposed as a mechanism to help reduce instruction cache misses, which in turn can help increase instruction supply to the processor. In this paper we examine a new instruction prefetch architecture called Fetch Directed Prefetching, and compare it to the performance of next-line prefetching and streaming buffers. This architecture uses a decoupled branch predictor and instruction cache, so the branch predictor can run ahead of the instruction cache fetch. In addition, we examine marking fetch blocks in the branch predictor that are kicked out of the instruction cache, so branch predicted fetch blocks can be accurately prefetched. Finally, we model the use of idle instruction cache ports to filter prefetch requests, thereby saving bus bandwidth to the L2 cache.
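
The filtering idea in the abstract can be illustrated with a minimal sketch. It assumes the decoupled branch predictor deposits predicted fetch addresses into a queue (called `ftq` here) ahead of the instruction cache, and that a prefetch engine uses idle cache ports to probe the tags before sending a request to the L2. All names (`select_prefetches`, `ftq`, `icache_lines`, `prefetch_buffer`, `idle_ports`), the 32-byte line size, and the policy of stopping when no idle port remains are assumptions made for illustration, not details taken from the paper.

```python
from collections import deque

LINE_SIZE = 32  # bytes per cache line (assumed value)


def line_of(addr):
    """Map a byte address to its cache line number."""
    return addr // LINE_SIZE


def select_prefetches(ftq, icache_lines, prefetch_buffer, idle_ports):
    """Pick cache lines to request from the L2 in one cycle (hypothetical helper).

    ftq             -- predicted fetch addresses, oldest first
    icache_lines    -- set of line numbers currently in the instruction cache
    prefetch_buffer -- set of line numbers already prefetched or in flight
    idle_ports      -- number of cache ports free for tag probes this cycle
    """
    requests = []
    probes_left = idle_ports
    # Skip the head entry: the instruction cache is about to fetch it anyway.
    for addr in list(ftq)[1:]:
        if probes_left == 0:
            break  # assumed policy: no idle port, so stop scanning this cycle
        line = line_of(addr)
        if line in prefetch_buffer or line in requests:
            continue  # already covered, no probe needed
        probes_left -= 1  # spend one idle port on a tag probe
        if line not in icache_lines:
            requests.append(line)  # probe missed, so the prefetch is useful
    return requests


# Predicted path touches lines 128, 129, 130, 129 again, then 256.
ftq = deque([0x1000, 0x1020, 0x1040, 0x1020, 0x2000])
icache = {line_of(0x1000), line_of(0x1040)}
print(select_prefetches(ftq, icache, set(), idle_ports=3))  # [129, 256]
```

In this sketch a line that is already cached costs a probe but produces no L2 request, which is the bandwidth saving the abstract attributes to filtering. Other policies, such as issuing unfiltered prefetches when no port is idle, would change only the `break` branch.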

Published In

MICRO 32: Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
November 1999
299 pages
ISBN: 076950437X

Publisher

IEEE Computer Society

United States

Qualifiers

  • Article

Conference

MICRO99

Acceptance Rates

MICRO 32 Paper Acceptance Rate 27 of 131 submissions, 21%;
Overall Acceptance Rate 484 of 2,242 submissions, 22%

Article Metrics

  • Downloads (last 12 months): 270
  • Downloads (last 6 weeks): 45
Reflects downloads up to 25 Dec 2024

Cited By

  • (2024) PDIP: Priority Directed Instruction Prefetching. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, pp. 846-861. DOI: 10.1145/3620665.3640394. Online publication date: 27-Apr-2024.
  • (2024) Wrong-Path-Aware Entangling Instruction Prefetcher. IEEE Transactions on Computers 73:2, pp. 548-559. DOI: 10.1109/TC.2023.3337308. Online publication date: 1-Feb-2024.
  • (2022) Just-In-Time Compilation on ARM—A Closer Look at Call-Site Code Consistency. ACM Transactions on Architecture and Code Optimization 19:4, pp. 1-23. DOI: 10.1145/3546568. Online publication date: 16-Sep-2022.
  • (2022) CRISP: critical slice prefetching. Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 300-313. DOI: 10.1145/3503222.3507745. Online publication date: 28-Feb-2022.
  • (2022) Shooting Down the Server Front-End Bottleneck. ACM Transactions on Computer Systems 38:3-4, pp. 1-30. DOI: 10.1145/3484492. Online publication date: 4-Jan-2022.
  • (2022) Thermometer. Proceedings of the 49th Annual International Symposium on Computer Architecture, pp. 742-756. DOI: 10.1145/3470496.3527430. Online publication date: 18-Jun-2022.
  • (2022) OCOLOS: Online COde Layout OptimizationS. Proceedings of the 55th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 530-545. DOI: 10.1109/MICRO56248.2022.00045. Online publication date: 1-Oct-2022.
  • (2022) Whisper: Profile-Guided Branch Misprediction Elimination for Data Center Applications. Proceedings of the 55th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 19-34. DOI: 10.1109/MICRO56248.2022.00017. Online publication date: 1-Oct-2022.
  • (2021) Twig: Profile-Guided BTB Prefetching for Data Center Applications. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 816-829. DOI: 10.1145/3466752.3480124. Online publication date: 18-Oct-2021.
  • (2021) PDede: Partitioned, Deduplicated, Delta Branch Target Buffer. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 779-791. DOI: 10.1145/3466752.3480046. Online publication date: 18-Oct-2021.