Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
Polygon rendering for interactive visualization on multicomputers
Publisher:
  • The University of North Carolina at Chapel Hill
ISBN:978-0-591-23545-6
Order Number:AAI9715692
Pages:
370
Reflects downloads up to 26 Sep 2024Bibliometrics
Skip Abstract Section
Abstract

This dissertation identifies a class of parallel polygon rendering algorithms suitable for interactive use on multicomputers, and presents a methodology for designing efficient algorithms within that class. The methodology was used to design a new polygon rendering algorithm that uses the frame-to-frame coherence of the screen image to evenly partition the rasterization at reasonable cost. An implementation of the algorithm on the Intel Touchstone Delta at Caltech, the largest multicomputer at the time, renders 3.1 million triangles per second. The rate was measured using a 806,640 triangle model and 512 i860 processors, and includes back-facing triangles. A similar algorithm is used in Pixel-Planes 5, a system that has specialized rasterization processors, and which, when introduced, had a benchmark score for the SPEC Graphics Performance Characterization Group "head" benchmark that was nearly four times faster than commercial workstations. The algorithm design methodology also identified significant performance improvements for Pixel-Planes 5.

All fully parallel polygon rendering algorithms have a sorting step to redistribute primitives or fragments according to their screen location. The algorithm class mentioned above is one of four classes of parallel rendering algorithms identified; the classes are differentiated by the type of data that is communicated between processors. The identified algorithm class, called sort-middle, sorts screen-space primitives between the transformation and rasterization.

The design methodology uses simulations and performance models to help make the design decisions. The resulting algorithm partitions the screen during rasterization into adaptively sized regions with an average of four regions per processor. The region boundaries are only changed when necessary: when one region is the rasterization bottleneck. On smaller systems, the algorithm balances the loads by assigning regions to processors once per frame, using the assignments made during one frame in the next. However, when 128 or more processors are used at high frame rates, the load balancing may take too long, and so static load balancing should be used. Additionally, a new all-to-all communication method improves the algorithm's performance on systems with more than 64 processors.

Contributors
  • The University of North Carolina at Chapel Hill
  • The University of North Carolina at Chapel Hill

Recommendations