Value Prediction and Speculative Execution on GPU

Liu, Shaoshan; Eisenbeis, Christine; Gaudiot, Jean-Luc

doi:10.1007/s10766-010-0155-0


Open access
Published: 01 December 2010

Volume 39, pages 533–552, (2011)

International Journal of Parallel Programming


Abstract

GPUs and CPUs have fundamentally different architectures. Conventional wisdom holds that GPUs can accelerate only applications that exhibit very high parallelism, especially vector parallelism of the kind found in image processing. In this paper, we explore the possibility of using GPUs for value prediction and speculative execution: we implement software value prediction techniques to accelerate programs with limited parallelism, and software speculation techniques to accelerate programs whose parallelism is exposed only at run time and is therefore hard to extract statically. Our experimental results show that, because of its relatively high overhead, mapping software value prediction onto existing GPUs may not bring any immediate performance gain. Software speculation also introduces some overhead, but mapping it onto existing GPUs can already bring some performance gain over a CPU. Based on these observations, we explore hardware implementations of speculative execution operations on GPU architectures to reduce the software overheads. The results indicate that the hardware extensions yield an almost tenfold reduction in control-divergent sequential operations, with only moderate hardware (5–8%) and power consumption (1–5%) overheads.
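To make the idea of software speculation on run-time parallelism more concrete, the following is a minimal sketch in plain Python, not the authors' GPU implementation. It assumes a loop of the form `data[write_idx[i]] = body(data[read_idx[i]])` whose index arrays are known only at run time. The function name `speculative_loop`, its parameters, and the conservative conflict test are all illustrative assumptions. The speculative phase stands in for what would be one GPU thread per iteration; the validation phase decides whether the parallel result can be committed or the loop must be re-executed sequentially.

```python
from typing import Callable, List, Sequence, Tuple


def speculative_loop(
    data: List[int],
    read_idx: Sequence[int],
    write_idx: Sequence[int],
    body: Callable[[int], int],
) -> Tuple[List[int], bool]:
    """Compute data[write_idx[i]] = body(data[read_idx[i]]) for every i.

    Returns the resulting list and a flag that is True when the
    speculative (parallel-style) execution could be committed.
    The input list is not modified.
    """
    n = len(read_idx)
    snapshot = list(data)  # checkpoint used for reads and for rollback

    # Speculative phase: every iteration reads from the checkpoint, as if
    # all iterations ran at once as independent threads.
    results = [body(snapshot[read_idx[i]]) for i in range(n)]

    # Validation phase (deliberately conservative): reject the speculation
    # if two iterations write the same element, or if an iteration reads
    # an element that a different iteration writes.
    writer = {}
    conflict = False
    for i, w in enumerate(write_idx):
        if w in writer:
            conflict = True
        writer[w] = i
    for i, r in enumerate(read_idx):
        if r in writer and writer[r] != i:
            conflict = True

    if not conflict:
        # Commit: apply the speculative results to a copy of the checkpoint.
        out = list(snapshot)
        for i, w in enumerate(write_idx):
            out[w] = results[i]
        return out, True

    # Misspeculation: discard the results and re-execute in program order.
    out = list(snapshot)
    for i in range(n):
        out[write_idx[i]] = body(out[read_idx[i]])
    return out, False
```

In this sketch the checkpoint copy, the validation pass, and the sequential fallback are the software overheads that speculation adds on top of the useful work. A loop with disjoint read and write indices commits directly, whereas a loop in which one iteration reads what an earlier one wrote falls back to sequential execution.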



Acknowledgments

This work is partly supported by the National Science Foundation under Grant No. CCF-0541403 and by the French Agence Nationale pour la Recherche (ANR) PetaQCD project. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or of the ANR.

Open Access

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

Author information

Authors and Affiliations

Microsoft, Redmond, WA, USA
Shaoshan Liu
Alchemy team, INRIA Saclay - Île-de-France & Univ Paris-Sud 11 (LRI, UMR CNRS 8623), Orsay, 91405, France
Christine Eisenbeis
University of California, Irvine, CA, USA
Jean-Luc Gaudiot


Corresponding author

Correspondence to Shaoshan Liu.

Rights and permissions

Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.


About this article

Cite this article

Liu, S., Eisenbeis, C. & Gaudiot, JL. Value Prediction and Speculative Execution on GPU. Int J Parallel Prog 39, 533–552 (2011). https://doi.org/10.1007/s10766-010-0155-0


Received: 10 September 2010
Accepted: 08 November 2010
Published: 01 December 2010
Issue Date: October 2011
DOI: https://doi.org/10.1007/s10766-010-0155-0
