DOI: 10.1145/2103736.2103740

Expressive array constructs in an embedded GPU kernel programming language

Published: 28 January 2012

Abstract

Graphics Processing Units (GPUs) are powerful computing devices that, with the advent of CUDA/OpenCL, are becoming useful for general-purpose computations. Obsidian is an embedded domain-specific language that generates CUDA kernels from functional descriptions. A symbolic array construction allows us to guarantee that intermediate arrays are fused away. However, the current array construction has some drawbacks; in particular, arrays cannot be combined efficiently. We add a new type of array, the push array, to the existing Obsidian system in order to solve this problem. The two array types complement each other and enable the definition of combinators that both take apart and combine arrays, and that result in efficient generated code. This extension to Obsidian is demonstrated on a sequence of sorting kernels, with good results. The case study also illustrates the use of combinators for expressing the structure of parallel algorithms. The work presented is preliminary, and the combinators presented must be generalised. However, the raw speed of the generated kernels bodes well.
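
The contrast between the existing array construction and push arrays can be illustrated with a small model. The sketch below is written in Python, not in Obsidian's Haskell, and is not Obsidian's API: the names `Pull`, `Push`, `pull_concat`, `push_concat`, `to_push`, and `materialize` are hypothetical. It assumes that the existing symbolic array is a length plus an index function (elements are computed on demand, so a map fuses into the consumer), and that a push array is a length plus a routine that hands each (index, value) pair to a writer.

```python
from dataclasses import dataclass
from typing import Callable, List

Writer = Callable[[int, int], None]  # write(index, value)


@dataclass
class Pull:
    """Assumed model of the existing array type: the consumer asks for element i."""
    length: int
    index: Callable[[int], int]


@dataclass
class Push:
    """Assumed model of a push array: the producer decides where each value goes."""
    length: int
    run: Callable[[Writer], None]


def pull_map(f: Callable[[int], int], arr: Pull) -> Pull:
    # Composes index functions, so no intermediate array is ever stored.
    return Pull(arr.length, lambda i: f(arr.index(i)))


def pull_concat(a: Pull, b: Pull) -> Pull:
    # Every element access must test which half it falls in.
    return Pull(
        a.length + b.length,
        lambda i: a.index(i) if i < a.length else b.index(i - a.length),
    )


def to_push(arr: Pull) -> Push:
    def run(write: Writer) -> None:
        for i in range(arr.length):
            write(i, arr.index(i))
    return Push(arr.length, run)


def push_concat(a: Push, b: Push) -> Push:
    # Two independent producers; b's writes are shifted by a fixed offset,
    # so no per-element conditional is needed.
    def run(write: Writer) -> None:
        a.run(write)
        b.run(lambda i, v: write(i + a.length, v))
    return Push(a.length + b.length, run)


def materialize(arr: Push) -> List[int]:
    out = [0] * arr.length
    def write(i: int, v: int) -> None:
        out[i] = v
    arr.run(write)
    return out


evens = Pull(3, lambda i: 2 * i)
odds = pull_map(lambda x: x + 1, evens)
combined = materialize(push_concat(to_push(evens), to_push(odds)))
```

In this model both concatenations produce the same elements, but `pull_concat` pays for a branch on every access, whereas `push_concat` simply runs two loops that write to disjoint index ranges. That is one plausible reading of why combining arrays is the weak point of the index-function representation.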

Published In

DAMP '12: Proceedings of the 7th workshop on Declarative aspects and applications of multicore programming
January 2012
62 pages
ISBN:9781450311175
DOI:10.1145/2103736
General Chair: Umut Acar
Program Chair: Vítor Santos Costa
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. arrays
  2. data parallelism
  3. embedded domain specific language
  4. general purpose gpu programming
  5. haskell

Qualifiers

  • Research-article

Conference

POPL '12