research-article

A Chip-Multiprocessor Architecture with Speculative Multithreading

Published: 01 September 1999

Abstract

    Much emphasis is now placed on chip-multiprocessor (CMP) architectures for exploiting thread-level parallelism in an application. In such architectures, speculation may be employed to execute applications that cannot be parallelized statically. In this paper, we present an efficient CMP architecture for speculative execution of sequential binaries without source recompilation. We present the software support that enables identification of threads from a sequential binary. The hardware includes a memory disambiguation mechanism that enables the detection of interthread memory dependence violations during speculative execution. This hardware is different from past proposals in that it does not rely on a snoopy-based cache-coherence protocol. Instead, it uses an approach similar to a directory-based scheme. Furthermore, the architecture includes a simple and efficient hardware mechanism to enable register-level communication between on-chip processors. Evaluation of this software-hardware approach shows that it is quite effective in achieving high performance when running sequential binaries.
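The abstract describes a centralized, directory-like mechanism that detects interthread memory dependence violations. The following is a minimal Python sketch of the kind of check such a mechanism could perform, not the paper's actual hardware design. The class name `DisambiguationTable`, its methods, and the convention that lower thread ids come earlier in sequential program order are all assumptions made for illustration.

```python
from collections import defaultdict


class DisambiguationTable:
    """Toy centralized table (hypothetical) tracking speculative accesses.

    Thread ids are assumed to follow sequential program order:
    a lower id means an earlier (less speculative) thread.
    """

    def __init__(self):
        # addr -> ids of threads that loaded addr before writing it themselves
        self.exposed_loads = defaultdict(set)
        # addr -> ids of threads that have stored to addr
        self.stores = defaultdict(set)

    def load(self, thread_id, addr):
        # A load is "exposed" only if this thread has not already
        # produced its own value for the address.
        if thread_id not in self.stores[addr]:
            self.exposed_loads[addr].add(thread_id)

    def store(self, thread_id, addr):
        """Record a store and return the later threads that read too early."""
        self.stores[addr].add(thread_id)
        touched = self.exposed_loads[addr] | self.stores[addr]
        violators = []
        for t in sorted(t for t in touched if t > thread_id):
            if t in self.exposed_loads[addr]:
                # t consumed a value that this store should have supplied.
                violators.append(t)
            if t in self.stores[addr]:
                # Threads after t see t's version, so this store cannot
                # affect them; stop scanning.
                break
        return violators

    def retire(self, thread_id):
        # Drop a thread's bookkeeping once it commits or is squashed.
        for table in (self.exposed_loads, self.stores):
            for ids in table.values():
                ids.discard(thread_id)
```

In this sketch, a store by an earlier thread returns the later threads that must be squashed and re-executed. Because every access consults one shared table, no broadcast to the other processors' caches is needed, which is the sense in which the approach resembles a directory rather than a snoopy protocol.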

      Published In

      IEEE Transactions on Computers  Volume 48, Issue 9
      September 1999
      140 pages
      ISSN:0018-9340

      Publisher

      IEEE Computer Society

      United States

      Author Tags

      1. Chip-multiprocessor
      2. control speculation
      3. data-dependence speculation
      4. speculative multithreading

      Qualifiers

      • Research-article
