DOI: 10.1145/2830772.2830831
research-article

The inner most loop iteration counter: a new dimension in branch history

Published: 05 December 2015

Abstract

The most efficient branch predictors proposed in academic literature exploit both global branch history and local branch history. However, local history branch predictor components introduce major design challenges, particularly for the management of speculative histories. Therefore, most effective hardware designs use only global history components and very limited forms of local histories such as a loop predictor.
The wormhole (WH) branch predictor was recently introduced to exploit branch outcome correlation in multidimensional loops. For some branches encapsulated in a multidimensional loop, their outcomes are correlated with those of the same branch in neighbor iterations, but in the previous outer loop iteration. Unfortunately, the practical implementation of the WH predictor is even more challenging than the implementation of local history predictors.
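As a rough illustration of the correlation described above, the sketch below uses a hypothetical branch condition inside a two-dimensional loop nest. The function `outcome` and the loop bounds are assumptions made for this example and are not taken from the paper. The outcome at inner iteration `j` of outer iteration `i` always equals the outcome at the neighboring inner iteration `j + 1` of the previous outer iteration `i - 1`.

```python
def outcome(i, j):
    # Hypothetical branch condition: taken along anti-diagonals of the
    # (outer, inner) iteration space. Chosen only for illustration.
    return (i + j) % 5 == 0


N_OUTER, N_INNER = 8, 16  # assumed loop bounds

matches = 0
total = 0
for i in range(1, N_OUTER):
    for j in range(N_INNER - 1):
        total += 1
        # Compare with the same branch at the neighboring inner iteration
        # (j + 1) of the previous outer loop iteration (i - 1).
        if outcome(i, j) == outcome(i - 1, j + 1):
            matches += 1

print(f"{matches} of {total} outcomes match the previous outer iteration")
```

Because both positions share the same value of `i + j`, every comparison matches for this particular condition. Real branches would show weaker and less regular correlation.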
In this paper, we introduce practical predictor components to exploit this branch outcome correlation in multidimensional loops: the IMLI-based predictor components. The iteration index of the inner most loop in an application can be efficiently monitored at instruction fetch time using the Inner Most Loop Iteration (IMLI) counter. The outcomes of some branches are strongly correlated with the value of this IMLI counter. A single PC+IMLI counter indexed table, the IMLI-SIC table, added to a neural component of any recent predictor (TAGE-based or perceptron-inspired) captures this correlation. Moreover, using the IMLI counter, one can efficiently manage the very long local histories of branches that are targeted by the WH predictor. A second IMLI-based component, IMLI-OH, allows for tracking the same set of hard-to-predict branches as WH.
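A minimal sketch of the two ideas in the paragraph above follows. It rests on assumptions that the abstract does not spell out: the counter is incremented when a backward conditional branch is taken and reset when one is not taken, and the table index is a simple XOR hash of the PC and the counter. The class names, counter widths, table size, hash function, and branch addresses are all hypothetical. In an actual predictor the table output would be one input summed into a neural component rather than a standalone prediction.

```python
class IMLICounter:
    """Tracks the iteration index of the innermost loop at fetch time."""

    def __init__(self, bits=10):
        self.max_count = (1 << bits) - 1  # assumed counter width
        self.count = 0

    def update(self, pc, target, taken):
        # Assumption: a backward conditional branch closes a loop.
        if target < pc:
            if taken:
                # Another iteration of the same loop: saturating increment.
                self.count = min(self.count + 1, self.max_count)
            else:
                # Loop exit: restart counting for the next loop instance.
                self.count = 0


class IMLISICTable:
    """Table of signed saturating counters indexed by PC and IMLI counter."""

    def __init__(self, log_entries=9, ctr_bits=6):
        self.size = 1 << log_entries
        self.ctr_max = (1 << (ctr_bits - 1)) - 1
        self.ctr_min = -(1 << (ctr_bits - 1))
        self.table = [0] * self.size

    def index(self, pc, imli):
        # Hypothetical hash; any reasonable mix of PC and counter would do.
        return (pc ^ (imli << 2)) % self.size

    def vote(self, pc, imli):
        # Positive leans taken, negative leans not taken.
        return self.table[self.index(pc, imli)]

    def train(self, pc, imli, taken):
        i = self.index(pc, imli)
        if taken:
            self.table[i] = min(self.table[i] + 1, self.ctr_max)
        else:
            self.table[i] = max(self.table[i] - 1, self.ctr_min)


# Toy trace: an 8-iteration inner loop whose body branch is taken only
# at inner iteration 3. Addresses are made up for the example.
BODY_PC, LOOP_PC, LOOP_TARGET = 0x400, 0x420, 0x3F0
imli, sic = IMLICounter(), IMLISICTable()
for outer in range(4):
    for j in range(8):
        sic.train(BODY_PC, imli.count, taken=(j == 3))
        imli.update(LOOP_PC, LOOP_TARGET, taken=(j < 7))

print(sic.vote(BODY_PC, 3), sic.vote(BODY_PC, 2))
```

In this toy trace the counter value equals the inner iteration index whenever the body branch is fetched, so the table entry for iteration 3 drifts positive while the entries for the other iterations drift negative.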
Managing the speculative states of the IMLI-based predictor components is quite simple. Our experiments show that augmenting a state-of-the-art global history predictor with IMLI components outperforms previous state-of-the-art academic predictors leveraging local and global history at much lower hardware complexity (i.e., smaller storage budget, smaller number of tables and simpler management of speculative states).



    Published In

    MICRO-48: Proceedings of the 48th International Symposium on Microarchitecture
    December 2015
    787 pages
    ISBN:9781450340342
    DOI:10.1145/2830772
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Research-article

    Funding Sources

    • Discovery grant
    • Natural Sciences and Engineering Research Council of Canada
    • European Research Council
    • Bell Graduate Scholarship

    Conference

    MICRO-48
    Acceptance Rates

    MICRO-48 Paper Acceptance Rate 61 of 283 submissions, 22%;
    Overall Acceptance Rate 484 of 2,242 submissions, 22%
