research-article

HIPA<sup>cc</sup>: A Domain-Specific Language and Compiler for Image Processing

Published: 01 January 2016

Abstract

Domain-specific languages (DSLs) provide high-level, domain-specific abstractions that allow expressive and concise algorithm descriptions. Since a DSL description also hides the properties of the target hardware, DSLs are a promising path to targeting different parallel and heterogeneous hardware from the same algorithm description. In theory, the DSL description can capture all characteristics of the algorithm that are required to generate highly efficient parallel implementations. However, most frameworks do not make use of this knowledge, and their performance cannot reach that of optimized library implementations. In this article, we present the HIPA<sup>cc</sup> framework, a DSL and source-to-source compiler for image processing. We show that domain knowledge can be captured in the language and that this knowledge enables us to generate tailored implementations for a given target architecture. Back ends for CUDA, OpenCL, and Renderscript allow us to target discrete graphics processing units (GPUs) as well as mobile, embedded GPUs. Exploiting the captured domain knowledge, we can generate specialized algorithm variants that reach the maximum performance achievable under the peak memory bandwidth. These implementations significantly outperform state-of-the-art domain-specific languages and libraries.
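
The claim about performance being limited by peak memory bandwidth can be made concrete with a small back-of-the-envelope calculation. The sketch below is not taken from the article: the function names, the image size, and the 100 GB/s bandwidth figure are hypothetical and chosen only for illustration. It assumes a kernel that reads each input pixel once and writes each output pixel once, so its runtime cannot be lower than the bytes moved divided by the peak bandwidth.

```python
def bandwidth_bound_ms(width, height, bytes_per_pixel, peak_gb_per_s,
                       reads=1, writes=1):
    """Lower bound on runtime in milliseconds for a memory-bound kernel.

    Assumes (hypothetically) that every pixel is transferred exactly
    `reads` times from memory and `writes` times to memory, and that
    peak bandwidth is given in GB/s with 1 GB = 1e9 bytes.
    """
    bytes_moved = width * height * bytes_per_pixel * (reads + writes)
    seconds = bytes_moved / (peak_gb_per_s * 1e9)
    return seconds * 1e3


def fraction_of_bound(measured_ms, bound_ms):
    """Share of the bandwidth-limited optimum that a measured runtime attains."""
    return bound_ms / measured_ms


if __name__ == "__main__":
    # Illustrative numbers only: a 4096 x 4096 image of 4-byte pixels
    # on a device with an assumed peak bandwidth of 100 GB/s.
    bound = bandwidth_bound_ms(4096, 4096, 4, 100.0)
    print(f"bandwidth-limited lower bound: {bound:.3f} ms")
    print(f"a 2.0 ms kernel attains {fraction_of_bound(2.0, bound):.0%} of the bound")
```

A generated kernel whose measured runtime approaches this bound is limited by memory traffic, not by arithmetic, which is the sense in which an implementation can be said to reach the maximum achievable performance.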


Published In

IEEE Transactions on Parallel and Distributed Systems, Volume 27, Issue 1
Jan. 2016
304 pages

Publisher

IEEE Press
