
Dynamically allocating processor resources between nearby and distant ILP

Published: 01 May 2001
  • Abstract

    Modern superscalar processors use wide instruction issue widths and out-of-order execution in order to increase instruction-level parallelism (ILP). Because instructions must be committed in order so as to guarantee precise exceptions, increasing ILP implies increasing the sizes of structures such as the register file, issue queue, and reorder buffer. Simultaneously, cycle time constraints limit the sizes of these structures, resulting in conflicting design requirements.
    In this paper, we present a novel microarchitecture designed to overcome the limitations of a register file size dictated by cycle time constraints. Available registers are dynamically allocated between the primary program thread and a future thread. The future thread executes instructions when the primary thread is limited by resource availability. The future thread is not constrained by in-order commit requirements. It can therefore examine a much larger instruction window and jump far ahead to execute ready instructions. Results are communicated back to the primary thread by warming up the register file, instruction cache, data cache, and instruction reuse buffer, and by resolving branch mispredicts early. The proposed microarchitecture achieves an overall speedup of 1.17 over the base processor for our benchmark set, with speedups of up to 1.64.
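
    To put the reported speedups in perspective, the short sketch below converts a speedup into the corresponding reduction in execution time. It assumes speedup is defined as base execution time divided by new execution time for the same work; the abstract does not state the exact metric, so this definition and the helper name `time_reduction` are illustrative assumptions.

```python
def time_reduction(speedup: float) -> float:
    """Fraction of base execution time saved, assuming speedup = T_base / T_new."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


if __name__ == "__main__":
    # The two speedups quoted in the abstract.
    for label, s in [("overall", 1.17), ("best case", 1.64)]:
        saved = time_reduction(s) * 100
        print(f"{label}: speedup {s:.2f} -> about {saved:.1f}% less execution time")
```

    Under that assumption, a speedup of 1.17 corresponds to roughly 14.5% less execution time, and 1.64 to roughly 39%.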


    Published In

    ACM SIGARCH Computer Architecture News, Volume 29, Issue 2
    Special Issue: Proceedings of the 28th annual international symposium on Computer architecture (ISCA '01)
    May 2001, 262 pages
    ISSN: 0163-5964
    DOI: 10.1145/384285

    ISCA '01: Proceedings of the 28th annual international symposium on Computer architecture
    June 2001, 289 pages
    ISBN: 0769511627
    DOI: 10.1145/379240

    Publisher

    Association for Computing Machinery, New York, NY, United States
