The Misprediction Recovery Cache

Published: 01 August 1998

Abstract

In modern processors, deep pipelines combine with superscalar techniques to allow each pipe stage to process multiple instructions. When such a pipe must be flushed and refilled, as when the predicted program flow beyond a branch is subsequently recognized as wrong, the temporary performance loss is significant. While modern branch target buffer (BTB) technology makes this flush/refill penalty fairly rare, the penalty that accrues from the remaining branch mispredictions is a serious impediment to even higher processor performance. Advanced mechanisms that can reduce this residual misprediction penalty can be of enormous value in future microprocessor designs. In this paper we describe the design and performance of a promising new mechanism called the Misprediction Recovery Cache (MRC). The key results of our study are: (1) Small, finite-sized MRCs (16 to 256 entries) can effectively reduce the branch penalty in deeply pipelined processors. (2) Commercial benchmarks such as the Winstone benchmarks make better use of larger MRCs than the predominantly technical SPECint benchmarks do, because they contain a large number of unique branch instructions. (3) MRC hit rates increase with increasing BTB prediction accuracy (by 5-200%, depending on MRC size) due to the fewer residual mispredictions associated with better prediction. (4) For the processor architecture we studied, the MRC resulted in up to a 20% improvement in CPI (cycles per instruction). (5) The incremental performance gain achievable by adding an MRC to a modern CISC processor (which uses a BTB with a two-level predictor) is two to three times what was achievable by going from a one-level predictor to a two-level predictor.
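To make the relationship between misprediction penalty, MRC hit rate, and CPI concrete, the following is a minimal sketch of a first-order additive stall model. It is not the model or simulator used in the paper. The function name, all parameter names, and the assumption that an MRC hit replaces the full flush/refill penalty with a smaller fixed penalty are hypothetical, and the numbers in the usage lines are illustrative only.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate,
                  flush_penalty, mrc_hit_rate=0.0, mrc_hit_penalty=0.0):
    """Estimate CPI with a simple additive stall model (an assumption).

    base_cpi         CPI with no misprediction stalls
    branch_fraction  fraction of instructions that are branches
    mispredict_rate  fraction of branches the BTB mispredicts
    flush_penalty    cycles lost on a full pipeline flush/refill
    mrc_hit_rate     fraction of mispredictions that hit in the MRC
    mrc_hit_penalty  cycles lost on a misprediction that hits in the MRC
                     (hypothetical: assumed smaller than flush_penalty)
    """
    # Mispredictions that hit in the MRC pay the reduced penalty;
    # the rest pay the full flush/refill penalty.
    avg_penalty = (mrc_hit_rate * mrc_hit_penalty
                   + (1.0 - mrc_hit_rate) * flush_penalty)
    return base_cpi + branch_fraction * mispredict_rate * avg_penalty


# Illustrative inputs only; none of these values come from the paper.
without_mrc = effective_cpi(1.0, 0.2, 0.1, 10)
with_mrc = effective_cpi(1.0, 0.2, 0.1, 10,
                         mrc_hit_rate=0.5, mrc_hit_penalty=2)
improvement = (without_mrc - with_mrc) / without_mrc
print(f"CPI without MRC: {without_mrc:.3f}")
print(f"CPI with MRC:    {with_mrc:.3f}")
print(f"Relative CPI improvement: {improvement:.1%}")
```

Under this model the benefit scales with the product of branch frequency, misprediction rate, and MRC hit rate, which is why the residual misprediction penalty matters more as pipelines get deeper and the flush penalty grows.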

Published In

International Journal of Parallel Programming, Volume 26, Issue 4
August 1998
189 pages

Publisher

Kluwer Academic Publishers

United States

Author Tags

  1. BRANCH MISPREDICTION
  2. CACHE
  3. CISC
  4. PIPELINE
  5. THREAD
