Tradeoffs in designing accelerator architectures for visual computing

Authors:

Daniel Johnson,

Sanjay J. Patel

MICRO 41: Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture

Pages 164 - 175

https://doi.org/10.1109/MICRO.2008.4771788

Published: 08 November 2008

Abstract

Visualization, interaction, and simulation (VIS) constitute a class of applications that is growing in importance. This class includes applications such as graphics rendering, video encoding, simulation, and computer vision. These applications are ideally suited for accelerators because of their parallelizability and demand for high throughput. We compile a benchmark suite, VISBench, to serve as a proxy for this application class.

Published In

MICRO 41: Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture

November 2008

483 pages

ISBN: 9781424428366

Sponsors

SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing

Publisher

IEEE Computer Society

United States

Qualifiers

Article

Conference

MICRO-41

Sponsor:

SIGMICRO

MICRO-41: The 41st Annual IEEE/ACM International Symposium on Microarchitecture

November 8 - 12, 2008

Acceptance Rates

MICRO 41 Paper Acceptance Rate 40 of 210 submissions, 19%;

Overall Acceptance Rate 484 of 2,242 submissions, 22%
