DOI: 10.1145/264107.264189

Dynamic speculation and synchronization of data dependences

Published: 01 May 1997

    Abstract

    Data dependence speculation is used in instruction-level parallel (ILP) processors to allow early execution of an instruction before a logically preceding instruction on which it may be data dependent. If the instruction is independent, data dependence speculation succeeds; if not, it fails, and the two instructions must be synchronized. Modern dynamically scheduled processors that use data dependence speculation do so blindly (i.e., every load instruction with unresolved dependences is speculated). In this paper, we demonstrate that as dynamic instruction windows get larger, significant performance benefits can result when intelligent decisions about data dependence speculation are made. We propose dynamic data dependence speculation techniques: (i) to predict if the execution of an instruction is likely to result in a data dependence mis-speculation, and (ii) to provide the synchronization needed to avoid a mis-speculation. Experimental results evaluating the effectiveness of the proposed techniques are presented within the context of a Multiscalar processor.
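    The two ideas in the abstract, predicting a likely mis-speculation and synchronizing the instructions involved, can be illustrated with a small software model. The sketch below is not the hardware mechanism from the paper. It assumes a table of saturating counters indexed by (load PC, store PC) pairs, and every name in it (`DependencePredictor`, `should_synchronize`, `train`, `run`) is hypothetical. Setting the threshold above the counter ceiling models blind speculation, since the load is then never made to wait.

```python
from dataclasses import dataclass, field


@dataclass
class DependencePredictor:
    """Toy predictor keyed by (load PC, store PC); an illustrative assumption."""
    threshold: int = 2   # counter value at or above which the load waits
    max_count: int = 3   # ceiling of the saturating counter
    counters: dict = field(default_factory=dict)

    def should_synchronize(self, load_pc, store_pc):
        """Return True if the load is predicted to conflict with the store."""
        return self.counters.get((load_pc, store_pc), 0) >= self.threshold

    def train(self, load_pc, store_pc, dependent):
        """Update the counter once the true outcome is known."""
        key = (load_pc, store_pc)
        count = self.counters.get(key, 0)
        if dependent:
            count = min(self.max_count, count + 1)
        else:
            count = max(0, count - 1)
        self.counters[key] = count


def run(trace, predictor):
    """Replay (load_pc, store_pc, dependent) events.

    Returns (mis_speculations, synchronized_loads, needless_waits).
    """
    mis = synced = needless = 0
    for load_pc, store_pc, dependent in trace:
        if predictor.should_synchronize(load_pc, store_pc):
            synced += 1          # load waits for the store: no mis-speculation
            if not dependent:
                needless += 1    # waited although the pair was independent
        elif dependent:
            mis += 1             # load ran early and read a stale value
        predictor.train(load_pc, store_pc, dependent)
    return mis, synced, needless


# One pair that always conflicts, one pair that never does.
trace = [(0x40, 0x10, True)] * 10 + [(0x44, 0x20, False)] * 10
blind_result = run(trace, DependencePredictor(threshold=4))  # never waits
smart_result = run(trace, DependencePredictor())
```

    On this trace, blind speculation mis-speculates on every conflicting load, while the counter-based version mis-speculates only while its counter warms up and never delays the independent pair.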


    Published In

    ISCA '97: Proceedings of the 24th annual international symposium on Computer architecture
    June 1997
    350 pages
    ISBN: 0897919017
    DOI: 10.1145/264107

    ACM SIGARCH Computer Architecture News, Volume 25, Issue 2
    Special Issue: Proceedings of the 24th annual international symposium on Computer architecture (ISCA '97)
    May 1997
    349 pages
    ISSN: 0163-5964
    DOI: 10.1145/384286

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Acceptance Rates

    Overall Acceptance Rate 543 of 3,203 submissions, 17%
