Compiling high performance recursive filters

Authors: Gaurav Chaurasia, Jonathan Ragan-Kelley, Sylvain Paris, George Drettakis, Fredo Durand

HPG '15: Proceedings of the 7th Conference on High-Performance Graphics

Pages 85 - 94

https://doi.org/10.1145/2790060.2790063

Published: 07 August 2015

Abstract

Infinite impulse response (IIR), or recursive, filters are essential for image processing because they turn expensive large-footprint convolutions into operations that have a constant cost per pixel regardless of kernel size. However, their recursive nature constrains the order in which pixels can be computed, severely limiting both parallelism within a filter and memory locality across multiple filters. Prior research has developed algorithms that compute IIR filters over image tiles. Using a divide-and-recombine strategy inspired by parallel prefix sum, these algorithms expose greater parallelism and exploit producer-consumer locality in pipelines of IIR filters over multi-dimensional images. While the principles are simple, it is hard, given a recursive filter, to derive a corresponding tile-parallel algorithm, and even harder to implement and debug it.

We show that parallel and locality-aware implementations of IIR filter pipelines can be obtained through program transformations, which we mechanize through a domain-specific compiler. We show that the composition of a small set of transformations suffices to cover the space of possible strategies. We also demonstrate that the tiled implementations can be automatically scheduled in a hardware-specific manner using a small set of generic heuristics. The programmer specifies the basic recursive filters, and the choice of transformation requires only a few lines of code. Our compiler then generates high-performance implementations that are an order of magnitude faster than standard GPU implementations and that outperform hand-tuned tiled implementations of specialized algorithms. Those hand-tuned implementations require orders of magnitude more programming effort: a few thousand lines of code per pipeline, compared with a few lines using our compiler.
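
The divide-and-recombine idea mentioned in the abstract can be illustrated on the simplest case, a one-dimensional first-order recursive filter y[i] = x[i] + a * y[i-1]. The following is a minimal sketch under that assumption, not the paper's compiler or its generated code; the function names `recursive_filter` and `tiled_recursive_filter` and the parameter `tile` are hypothetical and chosen only for illustration.

```python
def recursive_filter(x, a):
    """Sequential first-order recursive filter: y[i] = x[i] + a * y[i-1].

    The cost per sample is constant, but each output depends on the
    previous one, so the loop cannot be parallelized as written.
    """
    y = []
    prev = 0.0
    for v in x:
        prev = v + a * prev
        y.append(prev)
    return y


def tiled_recursive_filter(x, a, tile):
    """Same filter computed tile by tile (illustrative sketch only)."""
    tiles = [x[s:s + tile] for s in range(0, len(x), tile)]

    # Step 1: filter every tile on its own with a zero initial condition.
    # The tiles are independent here, so this step could run in parallel.
    local = [recursive_filter(t, a) for t in tiles]

    # Step 2: a short sequential pass over tiles (not over samples) that
    # recovers the true output value just before each tile starts.
    carry = 0.0
    carries = []
    for t in local:
        carries.append(carry)
        carry = t[-1] + (a ** len(t)) * carry

    # Step 3: add the decayed incoming carry to each local result.
    # This step is again independent per tile.
    out = []
    for t, c in zip(local, carries):
        out.extend(v + (a ** (j + 1)) * c for j, v in enumerate(t))
    return out
```

In this sketch only step 2 is sequential, and it touches one value per tile rather than one per sample, which is the same structure as a blocked parallel prefix sum. Higher-order filters, anticausal passes, and multi-dimensional images need more bookkeeping than is shown here.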

Supplementary Material

ZIP File (p85-chaurasia.zip), 2.45 MB

