DOI: 10.1145/1542275.1542333
research-article

Combining thread level speculation, helper threads, and runahead execution

Published: 08 June 2009

Abstract

With the current trend toward multicore architectures, improved execution performance can no longer be obtained via traditional single-thread instruction level parallelism (ILP), but, instead, via multithreaded execution. Generating thread-parallel programs is hard, and thread-level speculation (TLS) has been suggested as an execution model that can speculatively exploit thread-level parallelism (TLP) even when thread independence cannot be guaranteed by the programmer/compiler. Alternatively, the helper threads (HT) execution model has been proposed, where subordinate threads are executed in parallel with a main thread in order to improve the execution efficiency (i.e., ILP) of the latter. Yet another execution model, runahead execution (RA), has also been proposed, where subordinate versions of the main thread are dynamically created especially to cope with long-latency operations, again with the aim of improving the execution efficiency of the main thread.
Each one of these multithreaded execution models works best for different applications and application phases. In this paper we combine these three models into a single execution model and a single hardware infrastructure such that the system can dynamically adapt to find the most appropriate multithreaded execution model. More specifically, TLS is favored whenever successful parallel execution of instructions in multiple threads (i.e., TLP) is possible, and the system can seamlessly transition at run-time to the other models otherwise. In order to understand the tradeoffs involved, we also develop a performance model that allows one to quantitatively attribute overall performance gains to either TLP or ILP in such a combined multithreaded execution model.
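The two ideas in the paragraph above, preferring TLS and falling back to the other models, and splitting an overall gain into ILP and TLP parts, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `Mode`, `choose_mode`, and `attribute_speedup`, the two boolean inputs, the fallback order between RA and HT, and the notion of an "ILP-only" execution time. The attribution relies only on the identity t_base / t_combined = (t_base / t_ilp_only) * (t_ilp_only / t_combined); the paper's actual performance model may be defined differently.

```python
from enum import Enum
from typing import Tuple


class Mode(Enum):
    TLS = "thread-level speculation"
    HT = "helper thread"
    RA = "runahead execution"


def choose_mode(tlp_likely: bool, long_latency_stall: bool) -> Mode:
    """Hypothetical policy: favor TLS, otherwise fall back to RA or HT."""
    if tlp_likely:
        # Parallel execution is expected to succeed, so exploit TLP.
        return Mode.TLS
    if long_latency_stall:
        # Assumed fallback: run ahead while a long-latency operation is pending.
        return Mode.RA
    # Assumed default fallback: assist the main thread to improve its ILP.
    return Mode.HT


def attribute_speedup(
    t_base: float, t_ilp_only: float, t_combined: float
) -> Tuple[float, float, float]:
    """Split total speedup into an ILP factor and a TLP factor.

    t_base:     time of the baseline single-threaded run
    t_ilp_only: assumed time when extra threads only assist the main thread
    t_combined: time of the combined execution
    Returns (ilp_factor, tlp_factor, total), where total is their product.
    """
    if min(t_base, t_ilp_only, t_combined) <= 0:
        raise ValueError("execution times must be positive")
    ilp_factor = t_base / t_ilp_only
    tlp_factor = t_ilp_only / t_combined
    return ilp_factor, tlp_factor, ilp_factor * tlp_factor
```

Calling `attribute_speedup(100.0, 80.0, 50.0)`, for example, separates how much of the overall gain came from a faster main thread and how much from work actually executed in parallel, which is the kind of attribution the abstract describes.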
Experimental results show that our unified execution model achieves speedups of up to 41.2%, with an average of 10.2%, over an existing state-of-the-art TLS system and speedups of up to 35.2%, with an average of 18.3%, over a flavor of runahead execution for a subset of the SPEC2000 Int benchmark suite.

    Published In

    ICS '09: Proceedings of the 23rd international conference on Supercomputing
    June 2009
    544 pages
    ISBN:9781605584980
    DOI:10.1145/1542275
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. helper threads
    2. multi-cores
    3. runahead execution
    4. thread-level speculation

    Qualifiers

    • Research-article

    Conference

    ICS '09: International Conference on Supercomputing
    June 8 - 12, 2009
    Yorktown Heights, NY, USA

    Acceptance Rates

    Overall Acceptance Rate 629 of 2,180 submissions, 29%

