Abstract
In this paper we present a run-time mechanism that simultaneously executes multiple threads from a sequential program on a simultaneous multithreaded (SMT) processor. The threads are speculative in the sense that they are created by predicting the future control flow of the program. Moreover, the threads are not necessarily independent: data dependences may exist among simultaneously executing threads. To avoid the serialization that such dependences would otherwise cause, both the inter-thread dependences and the values that flow through them are predicted. Speculative threads correspond to different iterations of the same loop, which may significantly reduce the fetch bandwidth requirements, since many instructions are shared by several threads. The performance evaluation shows a significant improvement over single-threaded execution, which demonstrates the potential of the mechanism to exploit otherwise unused hardware contexts. Moreover, for some programs the new processor architecture can achieve an IPC (instructions per cycle) even higher than the peak fetch bandwidth.
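The idea of running loop iterations as speculative threads with predicted inter-thread values can be illustrated with a small software model. The sketch below is not the hardware mechanism of the paper; it is a toy under stated assumptions: a single loop-carried live-in value per iteration, a simple stride predictor (an assumption made here for illustration), and hypothetical names such as `simulate`, `body` and `contexts`. Threads are modelled one after another, although in hardware they would run concurrently.

```python
from typing import Callable, Tuple


def simulate(body: Callable[[int], int], start: int,
             iterations: int, contexts: int) -> Tuple[int, int, int]:
    """Toy model of loop-iteration speculation with value prediction.

    body      -- hypothetical loop body: maps an iteration's live-in value
                 to its live-out (the next iteration's live-in)
    start     -- live-in of the first iteration
    contexts  -- number of hardware contexts (threads launched per round)

    Returns (speculative threads committed, threads squashed, final value).
    """
    stride = 0          # assumed stride predictor state, learned at run time
    live_in = start     # correct live-in of the oldest, non-speculative thread
    committed = squashed = done = 0

    while done < iterations:
        batch = min(contexts, iterations - done)
        # Thread 0 is non-speculative; thread d starts from a predicted live-in.
        predicted = [live_in + stride * d for d in range(batch)]
        for t in range(batch):
            if predicted[t] != live_in:
                # Misprediction: this thread and all younger ones are squashed.
                squashed += batch - t
                break
            nxt = body(live_in)
            stride = nxt - live_in   # train the predictor on the real value
            live_in = nxt
            done += 1
            if t > 0:
                committed += 1       # a speculative thread verified as correct
    return committed, squashed, live_in


if __name__ == "__main__":
    # Induction-variable style loop: the live-in advances by 4 per iteration.
    print(simulate(lambda i: i + 4, start=0, iterations=8, contexts=4))
```

In this model a regular, stride-like loop-carried value lets most speculative threads commit once the predictor has warmed up, whereas an irregular value causes squashes and the execution degrades to one iteration per round. The final value always matches sequential execution, because a thread only commits after its predicted live-in has been verified.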
Copyright information
© 1999 Springer-Verlag
Cite this paper
Marcuello, P., González, A. (1999). Exploiting speculative thread-level parallelism on a SMT processor. In: Sloot, P., Bubak, M., Hoekstra, A., Hertzberger, B. (eds) High-Performance Computing and Networking. HPCN-Europe 1999. Lecture Notes in Computer Science, vol 1593. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0100636
DOI: https://doi.org/10.1007/BFb0100636
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65821-4
Online ISBN: 978-3-540-48933-7
eBook Packages: Springer Book Archive