Superscalar and superpipelining techniques increase the overlap between the instructions in a pipelined processor, and thus these techniques have the potential to improve processor performance by decreasing the average number of cycles between the execution of adjacent instructions. Yet, to obtain this potential performance benefit, an instruction scheduler for this high-performance processor must find the independent instructions within the instruction stream of an application to execute in parallel. For non-numerical applications, there is an insufficient number of independent instructions within a basic block, and consequently the instruction scheduler must search across the basic block boundaries for the extra instruction-level parallelism required by the superscalar and superpipelining techniques. To exploit instruction-level parallelism across a conditional branch, the instruction scheduler must support the movement of instructions above a conditional branch, and the processor must support the speculative execution of these instructions.
We define boosting, an architectural mechanism for speculative execution, that allows us to uncover the instruction-level parallelism across conditional branches without adversely affecting the instruction count of the application or the cycle time of the processor. Under boosting, the compiler is responsible for analyzing and scheduling instructions, while the hardware is responsible for ensuring that the effects of a speculatively-executed instruction do not corrupt the program state when the compiler is incorrect in its speculation. To experiment with boosting, we built a global instruction scheduler, which is specifically tailored for the non-numerical environment, and a simulator, which determines the cycle-count performance of our globally-scheduled programs. We also analyzed the hardware requirements for boosting in a typical load/store architecture. Through the cycle-count simulations and an understanding of the cycle-time impact of the hardware support for boosting, we found that only a small amount of hardware support for speculative execution is necessary to achieve good performance in a small-issue, superscalar processor.
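The division of labor described in the abstract (the compiler hoists an instruction above a conditional branch, and the hardware keeps its result from corrupting program state if the speculation turns out wrong) can be illustrated with a small toy model. The sketch below is not the design analyzed in the thesis: it assumes, purely for illustration, that speculative results are held in a separate shadow buffer until the branch resolves, and the names `BoostingMachine`, `resolve_branch`, and `run` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BoostingMachine:
    """Toy model (an assumption, not the thesis's hardware): architectural
    registers plus a shadow buffer that holds boosted (speculative) results."""
    regs: dict = field(default_factory=dict)
    shadow: dict = field(default_factory=dict)

    def write(self, reg, value, boosted=False):
        # A boosted write goes to the shadow buffer so it cannot change
        # architectural state before the branch outcome is known.
        if boosted:
            self.shadow[reg] = value
        else:
            self.regs[reg] = value

    def read(self, reg, boosted=False):
        # A boosted instruction may consume an earlier boosted result;
        # otherwise reads come from the architectural registers.
        if boosted and reg in self.shadow:
            return self.shadow[reg]
        return self.regs[reg]

    def resolve_branch(self, speculation_correct):
        # Correct speculation: commit the shadow values.
        # Incorrect speculation: discard them, leaving state untouched.
        if speculation_correct:
            self.regs.update(self.shadow)
        self.shadow.clear()


def run(speculation_correct):
    m = BoostingMachine(regs={"r1": 5, "r2": 7, "r3": 0})
    # Hypothetical schedule: the compiler moved "r3 = r1 + r2" above the
    # branch and marked it as boosted.
    total = m.read("r1", boosted=True) + m.read("r2", boosted=True)
    m.write("r3", total, boosted=True)
    m.resolve_branch(speculation_correct)
    return m.regs["r3"]


if __name__ == "__main__":
    print(run(True))   # speculation was correct: r3 holds the boosted result
    print(run(False))  # speculation was wrong: r3 keeps its old value
```

In this model the hoisted instruction costs nothing extra when the speculation is wrong, because its result is simply dropped; a real mechanism would also have to handle speculative stores and exceptions, which this sketch leaves out.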
Index Terms
- Support for speculative execution in high-performance processors