
The Vector-Thread Architecture

Published: 02 March 2004

Abstract

The vector-thread (VT) architectural paradigm unifies the vector and multithreaded compute models. The VT abstraction provides the programmer with a control processor and a vector of virtual processors (VPs). The control processor can use vector-fetch commands to broadcast instructions to all the VPs or each VP can use thread-fetches to direct its own control flow. A seamless intermixing of the vector and threaded control mechanisms allows a VT architecture to flexibly and compactly encode application parallelism and locality, and a VT machine exploits these to improve performance and efficiency. We present SCALE, an instantiation of the VT architecture designed for low-power and high-performance embedded systems. We evaluate the SCALE prototype design using detailed simulation of a broad range of embedded applications and show that its performance is competitive with larger and more complex processors.
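
The two control mechanisms named in the abstract can be illustrated with a toy software model. The sketch below is not the SCALE instruction set or the authors' implementation; the names `VirtualProcessor`, `ControlProcessor`, `vector_fetch`, and the idea of a "block" returning its successor are all assumptions made for illustration. It only shows a control processor broadcasting one block to every VP, after which each VP may keep fetching blocks on its own until its data-dependent loop ends.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for a group of instructions. A block updates one VP's
# registers and may return another block, which models that VP issuing a
# thread-fetch to continue along its own control path.
Block = Callable[["VirtualProcessor"], Optional["Block"]]


@dataclass
class VirtualProcessor:
    index: int
    regs: Dict[str, int] = field(default_factory=dict)

    def run(self, block: Block) -> None:
        # Keep executing until this VP stops issuing thread-fetches.
        next_block: Optional[Block] = block
        while next_block is not None:
            next_block = next_block(self)


class ControlProcessor:
    def __init__(self, num_vps: int) -> None:
        self.vps: List[VirtualProcessor] = [
            VirtualProcessor(i) for i in range(num_vps)
        ]

    def vector_fetch(self, block: Block) -> None:
        # Broadcast the same block to every VP (modelled sequentially here).
        for vp in self.vps:
            vp.run(block)


def init(vp: VirtualProcessor) -> Optional[Block]:
    # Vector-style work: every VP runs the same code on its own element.
    vp.regs["n"] = vp.index + 1
    vp.regs["steps"] = 0
    return None


def collatz_step(vp: VirtualProcessor) -> Optional[Block]:
    # Thread-style work: the trip count depends on each VP's own data.
    n = vp.regs["n"]
    if n == 1:
        return None  # this VP is done; no further thread-fetch
    vp.regs["n"] = n // 2 if n % 2 == 0 else 3 * n + 1
    vp.regs["steps"] += 1
    return collatz_step  # thread-fetch: loop independently of other VPs


def collatz_steps(num_vps: int) -> List[int]:
    cp = ControlProcessor(num_vps)
    cp.vector_fetch(init)
    cp.vector_fetch(collatz_step)
    return [vp.regs["steps"] for vp in cp.vps]


if __name__ == "__main__":
    print(collatz_steps(4))
```

In this model the first `vector_fetch` behaves like ordinary vector code, while the second lets each VP run a different number of iterations, which is the kind of intermixing the abstract describes. Real hardware would execute VPs in parallel; the sequential loop here is only a simplification.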



Published In

ACM SIGARCH Computer Architecture News  Volume 32, Issue 2
ISCA 2004
March 2004
373 pages
ISSN:0163-5964
DOI:10.1145/1028176
  • ISCA '04: Proceedings of the 31st Annual International Symposium on Computer Architecture
    June 2004
    373 pages
    ISBN:0769521436

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 02 March 2004
Published in SIGARCH Volume 32, Issue 2


Qualifiers

  • Article

Cited By

  • (2023) Extension VM: Interleaved Data Layout in Vector Memory. ACM Transactions on Architecture and Code Optimization 21:1 (1-23). DOI: 10.1145/3631528. Online publication date: 7-Nov-2023
  • (2020) Ara: A 1-GHz+ Scalable and Energy-Efficient RISC-V Vector Processor With Multiprecision Floating-Point Support in 22-nm FD-SOI. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 28:2 (530-543). DOI: 10.1109/TVLSI.2019.2950087. Online publication date: Feb-2020
  • (2019) Towards General Purpose Acceleration by Exploiting Common Data-Dependence Forms. Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture - MICRO '52 (924-939). DOI: 10.1145/3352460.3358276. Online publication date: 2019
  • (2017) Stream-Dataflow Acceleration. Proceedings of the 44th Annual International Symposium on Computer Architecture (416-429). DOI: 10.1145/3079856.3080255. Online publication date: 24-Jun-2017
  • (2016) Language-Extension-Based Vectorizing Compiling Scheme on SDR-DSP. Computer Engineering and Technology (15-23). DOI: 10.1007/978-981-10-3159-5_2. Online publication date: 9-Dec-2016
  • (2015) Single-Instruction Multiple-Data Execution. Synthesis Lectures on Computer Architecture 10:1 (1-121). DOI: 10.2200/S00647ED1V01Y201505CAC032. Online publication date: 27-May-2015
  • (2015) A variable warp size architecture. ACM SIGARCH Computer Architecture News 43:3S (489-501). DOI: 10.1145/2872887.2750410. Online publication date: 13-Jun-2015
  • (2015) Exploring the potential of heterogeneous von neumann/dataflow execution models. ACM SIGARCH Computer Architecture News 43:3S (298-310). DOI: 10.1145/2872887.2750380. Online publication date: 13-Jun-2015
  • (2015) A variable warp size architecture. Proceedings of the 42nd Annual International Symposium on Computer Architecture (489-501). DOI: 10.1145/2749469.2750410. Online publication date: 13-Jun-2015
  • (2015) Exploring the potential of heterogeneous von neumann/dataflow execution models. Proceedings of the 42nd Annual International Symposium on Computer Architecture (298-310). DOI: 10.1145/2749469.2750380. Online publication date: 13-Jun-2015
