Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
article

The Vector-Thread Architecture

Published: 02 March 2004 Publication History
  • Get Citation Alerts
  • Abstract

    The vector-thread (VT) architectural paradigm unifies the vectorand multithreaded compute models. The VT abstraction providesthe programmer with a control processor and a vector of virtualprocessors (VPs). The control processor can use vector-fetch commandsto broadcast instructions to all the VPs or each VP can usethread-fetches to direct its own control flow. A seamless intermixingof the vector and threaded control mechanisms allows a VT architectureto flexibly and compactly encode application parallelismand locality, and a VT machine exploits these to improve performanceand efficiency. We present SCALE, an instantiation of theVT architecture designed for low-power and high-performance embeddedsystems. We evaluate the SCALE prototype design usingdetailed simulation of a broad range of embedded applications andshow that its performance is competitive with larger and more complexprocessors.

    References

    [1]
    {1} T.-C. Chiueh. Multi-threaded vectorization. In ISCA-18, May 1991.
    [2]
    {2} C. R. Jesshope. Implementing an efficient vector instruction set in a chip multi-processor using micro-threaded pipelines. Australia Computer Science Communications, 23(4):80-88, 2001.
    [3]
    {3} K. Kitagawa, S. Tagaya, Y. Hagihara, and Y. Kanoh. A hardware overview of SX-6 and SX-7 supercomputer. NEC Research & Development Journal, 44(1):2-7, Jan 2003.
    [4]
    {4} C. Kozyrakis. Scalable vector media-processors for embedded systems. PhD thesis, University of California at Berkeley, May 2002.
    [5]
    {5} C. Kozyrakis and D. Patterson. Overcoming the limitations of conventional vector processors. In ISCA-30, June 2003.
    [6]
    {6} C. Kozyrakis, S. Perissakis, D. Patterson, T. Anderson, K. Asanovi¿, N. Cardwell, R. Fromm, J. Golbus, B. Gribstad, K. Keeton, R. Thomas, N. Treuhaft, and K. Yelick. Scalable Processors in the Billion-Transistor Era: IRAM. IEEE Computer, 30(9):75-78, Sept 1997.
    [7]
    {7} K. Mai, T. Paaske, N. Jayasena, R. Ho, W. Dally, and M. Horowitz. Smart Memories: A modular reconfigurable architecture. In Proc. ISCA 27, pages 161-171, June 2000.
    [8]
    {8} S. Rixner, W. Dally, U. Kapasi, B. Khailany, A. Lopez-Lagunas, P. Mattson, and J. Owens. A bandwidth-efficient architecture for media processing. In MICRO-31, Nov 1998.
    [9]
    {9} R. M. Russel. The CRAY-1 computer system. Communications of the ACM, 21(1):63-72, Jan 1978.
    [10]
    {10} K. Sankaralingam, R. Nagarajan, H. Liu, C. Kim, J. Huh, D. Burger, S. W. Keckler, and C. Moore. Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture. In ISCA-30, June 2003.
    [11]
    {11} J. E. Smith. Dynamic instruction scheduling and the Astronautics ZS-1. IEEE Computer, 22(7):21-35, July 1989.
    [12]
    {12} G. S. Sohi, S. E. Breach, and T. N. Vijaykumar. Multiscalar processors. In ISCA-22, pages 414-425, June 1995.
    [13]
    {13} E. Waingold, M. Taylor, D. Srikrishna, V. Sarkar, W. Lee, V. Lee, J. Kim, M. Frank, P. Finch, R. Barua, J. Babb, S. Amarasinghe, and A. Agarwal. Baring it all to software: Raw machines. IEEE Computer, 30(9):86-93, Sept 1997.
    [14]
    {14} J. Wawrzynek, K. Asanovi¿, B. Kingsbury, J. Beck, D. Johnson, and N. Morgan. Spert-II: A vector microprocessor system. IEEE Computer, 29(3):79-86, Mar 1996.
    [15]
    {15} M. Zhang and K. Asanovi¿. Highly-associative caches for low-power processors. In Kool Chips Workshop, MICRO-33, Dec 2000.

    Cited By

    View all
    • (2023)Extension VM: Interleaved Data Layout in Vector MemoryACM Transactions on Architecture and Code Optimization10.1145/363152821:1(1-23)Online publication date: 7-Nov-2023
    • (2020)Ara: A 1-GHz+ Scalable and Energy-Efficient RISC-V Vector Processor With Multiprecision Floating-Point Support in 22-nm FD-SOIIEEE Transactions on Very Large Scale Integration (VLSI) Systems10.1109/TVLSI.2019.295008728:2(530-543)Online publication date: Feb-2020
    • (2019)Towards General Purpose Acceleration by Exploiting Common Data-Dependence FormsProceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture - MICRO '5210.1145/3352460.3358276(924-939)Online publication date: 2019
    • Show More Cited By

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM SIGARCH Computer Architecture News
    ACM SIGARCH Computer Architecture News  Volume 32, Issue 2
    ISCA 2004
    March 2004
    373 pages
    ISSN:0163-5964
    DOI:10.1145/1028176
    Issue’s Table of Contents
    • cover image ACM Conferences
      ISCA '04: Proceedings of the 31st annual international symposium on Computer architecture
      June 2004
      373 pages
      ISBN:0769521436

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 02 March 2004
    Published in SIGARCH Volume 32, Issue 2

    Check for updates

    Qualifiers

    • Article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)45
    • Downloads (Last 6 weeks)3
    Reflects downloads up to 09 Aug 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2023)Extension VM: Interleaved Data Layout in Vector MemoryACM Transactions on Architecture and Code Optimization10.1145/363152821:1(1-23)Online publication date: 7-Nov-2023
    • (2020)Ara: A 1-GHz+ Scalable and Energy-Efficient RISC-V Vector Processor With Multiprecision Floating-Point Support in 22-nm FD-SOIIEEE Transactions on Very Large Scale Integration (VLSI) Systems10.1109/TVLSI.2019.295008728:2(530-543)Online publication date: Feb-2020
    • (2019)Towards General Purpose Acceleration by Exploiting Common Data-Dependence FormsProceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture - MICRO '5210.1145/3352460.3358276(924-939)Online publication date: 2019
    • (2017)Stream-Dataflow AccelerationProceedings of the 44th Annual International Symposium on Computer Architecture10.1145/3079856.3080255(416-429)Online publication date: 24-Jun-2017
    • (2016)Language-Extension-Based Vectorizing Compiling Scheme on SDR-DSPComputer Engineering and Technology10.1007/978-981-10-3159-5_2(15-23)Online publication date: 9-Dec-2016
    • (2015)Single-Instruction Multiple-Data ExecutionSynthesis Lectures on Computer Architecture10.2200/S00647ED1V01Y201505CAC03210:1(1-121)Online publication date: 27-May-2015
    • (2015)A variable warp size architectureACM SIGARCH Computer Architecture News10.1145/2872887.275041043:3S(489-501)Online publication date: 13-Jun-2015
    • (2015)Exploring the potential of heterogeneous von neumann/dataflow execution modelsACM SIGARCH Computer Architecture News10.1145/2872887.275038043:3S(298-310)Online publication date: 13-Jun-2015
    • (2015)A variable warp size architectureProceedings of the 42nd Annual International Symposium on Computer Architecture10.1145/2749469.2750410(489-501)Online publication date: 13-Jun-2015
    • (2015)Exploring the potential of heterogeneous von neumann/dataflow execution modelsProceedings of the 42nd Annual International Symposium on Computer Architecture10.1145/2749469.2750380(298-310)Online publication date: 13-Jun-2015
    • Show More Cited By

    View Options

    Get Access

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media