research-article

Efficient parallel stencil convolution in Haskell

Published: 22 September 2011

    Abstract

    Stencil convolution is a fundamental building block of many scientific and image processing algorithms. We present a declarative approach to writing such convolutions in Haskell that is both efficient at runtime and implicitly parallel. To achieve this we extend our prior work on the Repa array library with two new features: partitioned and cursored arrays. Combined with careful management of the interaction between GHC and its back-end code generator LLVM, we achieve performance comparable to the standard OpenCV library.
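As a rough illustration of what a stencil convolution computes, the following is a minimal sequential sketch over plain Haskell lists. It does not use the Repa API, and it does not model partitioned or cursored arrays or any parallelism; the names `Image`, `Stencil3x3`, `pixelOrZero`, and `applyStencil` are hypothetical and chosen only for this example. It assumes a 3x3 stencil, a zero-valued border outside the image, and applies the coefficients without flipping the kernel (which coincides with a true convolution only for symmetric stencils).

```haskell
module Main where

-- A 2D image as a list of rows (a stand-in for a real array type).
type Image = [[Double]]

-- A 3x3 stencil: three rows of three coefficients, centred on the middle entry.
type Stencil3x3 = [[Double]]

-- Look up a pixel, treating every position outside the image as zero.
pixelOrZero :: Image -> Int -> Int -> Double
pixelOrZero img r c
  | r < 0 || r >= length img = 0
  | c < 0 || c >= length row = 0
  | otherwise                = row !! c
  where
    row = img !! r  -- only forced once the row index is known to be valid

-- Apply the stencil at every pixel: each output value is the weighted sum
-- of the 3x3 neighbourhood around the corresponding input pixel.
applyStencil :: Stencil3x3 -> Image -> Image
applyStencil k img =
  [ [ sum [ (k !! (dr + 1) !! (dc + 1)) * pixelOrZero img (r + dr) (c + dc)
          | dr <- [-1, 0, 1], dc <- [-1, 0, 1] ]
    | c <- [0 .. width - 1] ]
  | r <- [0 .. height - 1] ]
  where
    height = length img
    width  = if null img then 0 else length (head img)

-- The identity stencil keeps only the centre pixel.
identityStencil :: Stencil3x3
identityStencil = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]

sample :: Image
sample = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

main :: IO ()
main = print (applyStencil identityStencil sample == sample)
```

The bounds test in `pixelOrZero` runs for every neighbour of every pixel, and list indexing is linear-time, so this sketch only shows the semantics of the operation and is not representative of the performance discussed in the abstract.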

    Supplementary Material

    JPG File (_talk6.jpg)
    MP4 File (_talk6.mp4)

    Information & Contributors

    Information

    Published In

    ACM SIGPLAN Notices  Volume 46, Issue 12
    Haskell '11
    December 2011
    129 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/2096148
      Haskell '11: Proceedings of the 4th ACM symposium on Haskell
      September 2011
      136 pages
      ISBN:9781450308601
      DOI:10.1145/2034675
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published in SIGPLAN Volume 46, Issue 12

    Author Tags

    1. arrays
    2. data parallelism
    3. haskell

    Qualifiers

    • Research-article

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 3
    • Downloads (Last 6 weeks): 0
    Reflects downloads up to 11 Aug 2024

    Citations

    Cited By

    • (2013) An EDSL approach to high performance Haskell programming. ACM SIGPLAN Notices, 48:12, pp. 1-12. DOI: 10.1145/2578854.2503789. Online publication date: 23-Sep-2013
    • (2013) An EDSL approach to high performance Haskell programming. Proceedings of the 2013 ACM SIGPLAN symposium on Haskell, pp. 1-12. DOI: 10.1145/2503778.2503789. Online publication date: 23-Sep-2013
    • (2013) Parallelization of Shallow-water Equations with the Algorithmic Skeleton Library SkelGIS. Procedia Computer Science, 18, pp. 591-600. DOI: 10.1016/j.procs.2013.05.223. Online publication date: 2013
    • (2017) Exploiting vector instructions with generalized stream fusion. Communications of the ACM, 60:5, pp. 83-91. DOI: 10.1145/3060597. Online publication date: 24-Apr-2017
    • (2016) Why So Many? Proceedings of the 1st International Workshop on Real World Domain Specific Languages, pp. 1-2. DOI: 10.1145/2889420.2893172. Online publication date: 12-Mar-2016
    • (2014) Native offload of Haskell repa programs to integrated GPUs. Proceedings of the 3rd ACM SIGPLAN workshop on Functional high-performance computing, pp. 87-97. DOI: 10.1145/2636228.2636236. Online publication date: 3-Sep-2014
    • (2014) Defunctionalizing push arrays. Proceedings of the 3rd ACM SIGPLAN workshop on Functional high-performance computing, pp. 43-52. DOI: 10.1145/2636228.2636231. Online publication date: 3-Sep-2014
    • (2014) Array Operators Using Multiple Dispatch. Proceedings of ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming, pp. 56-61. DOI: 10.1145/2627373.2627383. Online publication date: 9-Jun-2014
    • (2013) Measuring the Haskell Gap. Proceedings of the 25th symposium on Implementation and Application of Functional Languages, pp. 61-72. DOI: 10.1145/2620678.2620685. Online publication date: 28-Aug-2013
    • (2013) Automatic SIMD vectorization for Haskell. ACM SIGPLAN Notices, 48:9, pp. 25-36. DOI: 10.1145/2544174.2500605. Online publication date: 25-Sep-2013