Abstract
GPUs and CPUs have fundamentally different architectures. It is conventional wisdom that GPUs can accelerate only those applications that exhibit very high parallelism, especially vector parallelism, as found in image processing. In this paper, we explore the possibility of using GPUs for value prediction and speculative execution: we implement software value prediction techniques to accelerate programs with limited parallelism, and software speculation techniques to accelerate programs whose parallelism appears only at run time and which are therefore hard to parallelize statically. Our experimental results show that, because of its relatively high overhead, mapping software value prediction techniques onto existing GPUs may not bring any immediate performance gain. On the other hand, although software speculation techniques introduce some overhead as well, mapping these techniques onto existing GPUs can already bring some performance gain over the CPU. Based on these observations, we explore the hardware implementation of speculative execution operations on GPU architectures to reduce the software performance overheads. The results indicate that the hardware extensions yield an almost tenfold reduction in control-divergent sequential operations, with only moderate hardware (5–8%) and power consumption (1–5%) overheads.
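To make the idea of software value prediction concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes a loop with a carried dependence x[i+1] = step(x[i]) and a simple stride predictor; the names `speculate_chain` and `stride_predictor` are hypothetical. The code runs serially on a CPU purely to show the control flow: on a GPU, the first phase is the part that could be mapped to independent threads, while the in-order verification is the kind of sequential overhead the abstract refers to.

```python
from typing import Callable, Tuple


def stride_predictor(stride: int) -> Callable[[int, int], int]:
    """Hypothetical predictor: guess the input of iteration i as x0 + i * stride."""
    return lambda x0, i: x0 + i * stride


def speculate_chain(
    step: Callable[[int], int],
    x0: int,
    n: int,
    predict: Callable[[int, int], int],
) -> Tuple[int, int]:
    """Run n iterations of x = step(x) using predicted inputs.

    Returns the final value and the number of mispredicted iterations.
    """
    # Phase 1 (independent work): every iteration guesses its own input
    # and computes a speculative output without waiting for its predecessor.
    guesses = [predict(x0, i) for i in range(n)]
    speculative = [step(g) for g in guesses]

    # Phase 2 (in-order verification): commit a speculative result only if
    # the guessed input equals the actual input; otherwise re-execute.
    x = x0
    mispredictions = 0
    for i in range(n):
        if guesses[i] == x:
            x = speculative[i]
        else:
            mispredictions += 1
            x = step(x)
    return x, mispredictions


if __name__ == "__main__":
    # A perfectly strided chain: every prediction is correct.
    print(speculate_chain(lambda x: x + 3, 0, 5, stride_predictor(3)))
    # A chain that breaks the stride once, forcing one re-execution.
    print(speculate_chain(lambda x: x + 4 if x == 9 else x + 3, 0, 5, stride_predictor(3)))
```

The result is always the same as that of the sequential loop, because a speculative output is committed only after its input has been verified. The benefit depends entirely on how often the prediction is correct and on how expensive the verification phase is relative to `step`.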
Acknowledgments
This work is partly supported by the National Science Foundation under Grant No. CCF-0541403 and by the French Agence Nationale pour la Recherche (ANR) PetaQCD project. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or of the ANR.
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
Rights and permissions
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Liu, S., Eisenbeis, C. & Gaudiot, JL. Value Prediction and Speculative Execution on GPU. Int J Parallel Prog 39, 533–552 (2011). https://doi.org/10.1007/s10766-010-0155-0
DOI: https://doi.org/10.1007/s10766-010-0155-0