DOI: 10.1145/1071866.1071878

A reconfigurable architecture for load-balanced rendering

Published: 30 July 2005


Commodity graphics hardware has become increasingly programmable over the last few years but has been limited to fixed resource allocation. These architectures handle some workloads well, others poorly; load-balancing to maximize graphics hardware performance has become a critical issue. In this paper, we explore one solution to this problem using compile-time resource allocation. For our experiments, we implement a graphics pipeline on Raw, a tile-based multicore processor. We express both the full graphics pipeline and the shaders using StreamIt, a high-level language based on the stream programming model. The programmer specifies the number of tiles per pipeline stage, and the StreamIt compiler maps the computation to the Raw architecture. We evaluate our reconfigurable architecture using a mix of common rendering tasks with different workloads and improve throughput by 55-157% over a static allocation. Although our early prototype cannot compete in performance against commercial state-of-the-art graphics processors, we believe that this paper describes an important first step in addressing the load-balancing challenge.
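The load-balancing idea in the abstract can be illustrated with a small sketch. It is not the paper's method or the StreamIt compiler's mapping algorithm: the stage names, the per-stage work estimates, the tile count, and the greedy heuristic are all assumptions chosen for illustration. The model treats pipeline throughput as limited by the slowest stage and assumes that a stage's work divides evenly across the tiles assigned to it.

```python
# Illustrative model only: stage names, costs, and the greedy heuristic are
# assumptions for this sketch, not values or algorithms taken from the paper.

def bottleneck_time(work, tiles):
    """Time per item of the slowest stage, assuming work divides evenly over tiles."""
    return max(w / t for w, t in zip(work, tiles))


def allocate_tiles(work, total_tiles):
    """Give every stage one tile, then hand each spare tile to the current bottleneck."""
    if total_tiles < len(work):
        raise ValueError("need at least one tile per pipeline stage")
    tiles = [1] * len(work)
    for _ in range(total_tiles - len(work)):
        # The stage with the largest per-tile load limits pipeline throughput.
        slowest = max(range(len(work)), key=lambda k: work[k] / tiles[k])
        tiles[slowest] += 1
    return tiles


# Hypothetical per-item costs for a pixel-heavy workload.
stages = ["vertex", "rasterize", "pixel", "blend"]
work = [2.0, 4.0, 24.0, 2.0]

static_tiles = [2, 2, 2, 2]                 # fixed, even split of 8 tiles
balanced_tiles = allocate_tiles(work, 8)    # workload-aware split of 8 tiles

static_time = bottleneck_time(work, static_tiles)
balanced_time = bottleneck_time(work, balanced_tiles)
speedup = static_time / balanced_time
```

In this toy model the even split leaves the pixel stage as the bottleneck, while the workload-aware split moves most spare tiles to it. A vertex-heavy workload would call for a different split, which is why a fixed allocation serves some workloads well and others poorly.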





Published In

HWWS '05: Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware
July 2005
121 pages
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]



Publisher: Association for Computing Machinery, New York, NY, United States



Qualifiers: Article


GH05: Graphics Hardware 2005
July 30 - 31, 2005
Los Angeles, California

Acceptance Rates

Overall Acceptance Rate 37 of 94 submissions, 39%




Article Metrics

  • Downloads (Last 12 months): 6
  • Downloads (Last 6 weeks): 0

Reflects downloads up to 18 Jan 2025



Cited By

  • (2019) "Deep reinforcement learning for search, recommendation, and online advertising: a survey" by Xiangyu Zhao, Long Xia, Jiliang Tang, and Dawei Yin with Martin Vesely as coordinator. ACM SIGWEB Newsletter 2019:Spring (1-15). DOI: 10.1145/3320496.3320500. Online publication date: 29-Jul-2019
  • (2019) "Are you an influencer, or a lurker? why not both! understanding alternate, opposite behaviors in complex social network systems" by Diego Perna, Roberto Interdonato, and Andrea Tagarelli with Martin Vesely as coordinator. ACM SIGWEB Newsletter 2019:Spring (1-8). DOI: 10.1145/3320496.3320499. Online publication date: 29-Jul-2019
  • (2017) Effective static bin patterns for sort-middle rendering. Proceedings of High Performance Graphics (1-10). DOI: 10.1145/3105762.3105777. Online publication date: 28-Jul-2017
  • (2016) A machine learning approach to mapping streaming workloads to dynamic multicore processors. ACM SIGPLAN Notices 51:5 (113-122). DOI: 10.1145/2980930.2907951. Online publication date: 13-Jun-2016
  • (2016) HCloud. ACM SIGARCH Computer Architecture News 44:2 (473-488). DOI: 10.1145/2980024.2872365. Online publication date: 25-Mar-2016
  • (2016) A machine learning approach to mapping streaming workloads to dynamic multicore processors. Proceedings of the 17th ACM SIGPLAN/SIGBED Conference on Languages, Compilers, Tools, and Theory for Embedded Systems (113-122). DOI: 10.1145/2907950.2907951. Online publication date: 13-Jun-2016
  • (2013) Flexible filters in stream programs. ACM Transactions on Embedded Computing Systems 13:3 (1-26). DOI: 10.1145/2539036.2539041. Online publication date: 24-Dec-2013
  • (2011) Dynamic Fine-Grain Scheduling of Pipeline Parallelism. Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques (22-32). DOI: 10.1109/PACT.2011.9. Online publication date: 10-Oct-2011
  • (2010) An empirical characterization of stream programs and its implications for language and compiler design. Proceedings of the 19th international conference on Parallel architectures and compilation techniques (365-376). DOI: 10.1145/1854273.1854319. Online publication date: 11-Sep-2010
  • (2009) Universal rasterizer with edge equations and tile-scan triangle traversal algorithm for graphics processing units. Proceedings of the 2009 IEEE international conference on Multimedia and Expo (1358-1361). DOI: 10.5555/1698924.1699259. Online publication date: 28-Jun-2009
