On-the-fly Vertex Reuse for Massively-Parallel Software Geometry Processing

Authors:

Michael Kenzel,

Bernhard Kerbl,

Wolfgang Tatzgern,

Elena Ivanchenko,

Dieter Schmalstieg,

Markus Steinberger

Proceedings of the ACM on Computer Graphics and Interactive Techniques, Volume 1, Issue 2

Article No.: 28, Pages 1 - 17

https://doi.org/10.1145/3233303

Published: 24 August 2018

Abstract

Due to its flexibility, compute mode is becoming increasingly attractive as a way to implement many of the algorithms that are part of a state-of-the-art rendering pipeline. A key problem commonly encountered in graphics applications is streaming vertex and geometry processing. In a typical triangle mesh, the same vertex is referenced six times on average. To avoid redundant computation during rendering, a post-transform cache is traditionally employed to reuse vertex processing results. However, such a vertex cache generally cannot be implemented efficiently in software and does not scale well as parallelism increases. We explore alternative strategies for reusing per-vertex results on-the-fly during massively-parallel software geometry processing. Given an input stream divided into batches, we analyze the effectiveness of sorting, hashing, and intra-thread-group communication for identifying and exploiting local reuse potential. We design and present four vertex reuse strategies tailored to modern GPU architectures. We demonstrate that, in a variety of applications, these strategies not only achieve effective reuse of vertex processing results, but can boost performance by up to 2-3x compared to a naïve approach. Curiously, our experiments also show that our batch-based approaches exhibit behavior similar to the OpenGL implementation on current graphics hardware.
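
To make the idea of batch-local reuse concrete, the following is a minimal sequential sketch in Python. It is not one of the paper's four strategies and is not GPU code: it only illustrates, under simplifying assumptions, how hashing within a fixed-size batch of an index stream lets each distinct vertex be processed once per batch. The names `dedup_batch` and `shading_rate` and the parameter `batch_triangles` are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def dedup_batch(indices: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Hash-based reuse inside one batch (illustrative, sequential).

    Returns (unique, remap): `unique` lists each distinct vertex index once,
    in first-use order; `remap[i]` is the slot in `unique` for `indices[i]`.
    """
    slot_of: Dict[int, int] = {}  # vertex index -> slot in `unique`
    unique: List[int] = []
    remap: List[int] = []
    for v in indices:
        slot = slot_of.get(v)
        if slot is None:
            # First reference in this batch: the vertex must be processed.
            slot = len(unique)
            slot_of[v] = slot
            unique.append(v)
        # Later references reuse the already assigned slot.
        remap.append(slot)
    return unique, remap


def shading_rate(index_buffer: Sequence[int], batch_triangles: int) -> float:
    """Vertex invocations per triangle when reuse is limited to one batch."""
    if batch_triangles <= 0:
        raise ValueError("batch_triangles must be positive")
    if len(index_buffer) % 3 != 0:
        raise ValueError("index buffer must hold whole triangles")
    triangles = len(index_buffer) // 3
    if triangles == 0:
        return 0.0
    step = 3 * batch_triangles
    invocations = 0
    for start in range(0, len(index_buffer), step):
        unique, _ = dedup_batch(index_buffer[start:start + step])
        invocations += len(unique)
    return invocations / triangles
```

For two triangles sharing an edge, `[0, 1, 2, 2, 1, 3]`, a batch size of two triangles yields four invocations (2.0 per triangle), while a batch size of one yields six (3.0 per triangle), since reuse across batch boundaries is not detected in this sketch.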

Supplementary Material

kenzel (kenzel.zip)

Supplemental movie, appendix, image and software files for On-the-fly Vertex Reuse for Massively-Parallel Software Geometry Processing

Cited By

Tine B, Yalamarthy K, Elsabbagh F, Hyesoon K. (2021). Vortex: Extending the RISC-V ISA for GPGPU and 3D-Graphics. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 754-766. Online publication date: 18-Oct-2021.
https://dl.acm.org/doi/10.1145/3466752.3480128
Kerbl B, Kenzel M, Ivanchenko E, Schmalstieg D, Steinberger M. (2018). Revisiting The Vertex Cache. Proceedings of the ACM on Computer Graphics and Interactive Techniques 1:2, 1-16. Online publication date: 24-Aug-2018.
https://dl.acm.org/doi/10.1145/3233302

Index Terms

On-the-fly Vertex Reuse for Massively-Parallel Software Geometry Processing
1. Computing methodologies
  1. Computer graphics
    1. Rendering
      1. Rasterization
  2. Parallel computing methodologies
    1. Parallel algorithms
      1. Massively parallel algorithms

Published In

Proceedings of the ACM on Computer Graphics and Interactive Techniques Volume 1, Issue 2

August 2018

223 pages

EISSN:2577-6193

DOI:10.1145/3273023

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed
