research-article

Work efficient higher-order vectorisation

Authors:

Manuel M.T. Chakravarty,

Gabriele Keller,

Roman Leshchinskiy,

Simon Peyton Jones

ICFP '12: Proceedings of the 17th ACM SIGPLAN international conference on Functional programming

Pages 259 - 270

https://doi.org/10.1145/2364527.2364564

Published: 09 September 2012

Abstract

Existing approaches to higher-order vectorisation, also known as flattening nested data parallelism, do not preserve the asymptotic work complexity of the source program. Straightforward examples, such as sparse matrix-vector multiplication, can suffer a severe blow-up in both time and space, which limits the practicality of this method. We discuss why this problem arises, identify the mis-handling of index space transforms as the root cause, and present a solution using a refined representation of nested arrays. We have implemented this solution in Data Parallel Haskell (DPH) and present benchmarks showing that realistic programs, which used to suffer the blow-up, now have the correct asymptotic work complexity. In some cases, the asymptotic complexity of the vectorised program is even better than the original.
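
To make the blow-up concrete, here is a minimal sketch in Python (not DPH), assuming a simplified segmented representation in which a nested array is stored as a list of segment lengths plus one flat data array. All function names are hypothetical, and `smvm_replicating` is only a simplified stand-in for the kind of replication that a naive flattening can introduce; it is not the transformation or the refined representation from the paper.

```python
def segmented_sum(lengths, flat):
    """Sum each segment of `flat`; `lengths` gives the size of each segment."""
    sums, start = [], 0
    for n in lengths:
        sums.append(sum(flat[start:start + n]))
        start += n
    return sums


def smvm_shared(lengths, cols, vals, vec):
    """Flat sparse matrix-vector product that shares `vec` between rows.

    Work is proportional to the number of non-zero elements.
    """
    gathered = [vec[c] for c in cols]  # one lookup per non-zero
    products = [v * x for v, x in zip(vals, gathered)]
    return segmented_sum(lengths, products)


def smvm_replicating(lengths, cols, vals, vec):
    """Same result, but physically copies `vec` once per row first.

    Illustrative assumption: this mimics an index-space transform that
    materialises its result, so work grows with rows * len(vec).
    Returns the product and the number of vector elements copied.
    """
    copies = [list(vec) for _ in lengths]  # rows * len(vec) elements
    row_of = [r for r, n in enumerate(lengths) for _ in range(n)]
    gathered = [copies[r][c] for r, c in zip(row_of, cols)]
    products = [v * x for v, x in zip(vals, gathered)]
    copied = sum(len(c) for c in copies)
    return segmented_sum(lengths, products), copied


# A 3x3 sparse matrix with rows [(0, 2.0), (2, 1.0)], [], [(1, 3.0)],
# stored as segment lengths plus flat column indices and values.
lengths = [2, 0, 1]
cols = [0, 2, 1]
vals = [2.0, 1.0, 3.0]
vec = [1.0, 2.0, 3.0]

shared_result = smvm_shared(lengths, cols, vals, vec)
replicated_result, copied = smvm_replicating(lengths, cols, vals, vec)
```

On this example both versions compute the same product, but the replicating version materialises 9 vector elements to serve 3 non-zeros, while the sharing version performs one lookup per non-zero. The gap widens as the matrix gets larger and sparser.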

Index Terms

Work efficient higher-order vectorisation
1. Software and its engineering
  1. Software notations and tools
    1. General programming languages
      1. Language features
        Abstract data types
        Concurrent programming structures
        Polymorphism

Published In

ICFP '12: Proceedings of the 17th ACM SIGPLAN international conference on Functional programming

September 2012

392 pages

ISBN:9781450310543

DOI:10.1145/2364527

General Chair: Peter Thiemann, University of Freiburg, Germany
Program Chair: Robby Findler, Northwestern University, USA

ACM SIGPLAN Notices Volume 47, Issue 9
ICFP '12
September 2012
368 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/2398856

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICFP'12

Sponsor:

SIGPLAN

ICFP'12: ACM SIGPLAN International Conference on Functional Programming

September 9 - 15, 2012

Copenhagen, Denmark

Acceptance Rates

ICFP '12 Paper Acceptance Rate 32 of 88 submissions, 36%;

Overall Acceptance Rate 333 of 1,064 submissions, 31%

Cited By

Matsuda M, Fukuda K, Maruyama N (2018). A Portability Layer of an All-pairs Operation for Hierarchical N-Body Algorithm Framework Tapas. Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, pages 241-250. Online publication date: 28-Jan-2018.
https://dl.acm.org/doi/10.1145/3149457.3149471

SuB T, Doring N, Brinkmann A, Nagel L (2018). And Now for Something Completely Different: Running Lisp on GPUs. 2018 IEEE International Conference on Cluster Computing (CLUSTER), pages 434-444. Online publication date: Sep-2018.
https://doi.org/10.1109/CLUSTER.2018.00060

Clifton-Everest R, McDonell T, Chakravarty M, Keller G (2017). Streaming irregular arrays. ACM SIGPLAN Notices 52:10, pages 174-185. Online publication date: 7-Sep-2017.
https://dl.acm.org/doi/10.1145/3156695.3122971

Clifton-Everest R, McDonell T, Chakravarty M, Keller G, Diatchki I (2017). Streaming irregular arrays. Proceedings of the 10th ACM SIGPLAN International Symposium on Haskell, pages 174-185. Online publication date: 7-Sep-2017.
https://dl.acm.org/doi/10.1145/3122955.3122971

Griffioen P, Lämmel R (2015). Type inference for array programming with dimensioned vector spaces. Proceedings of the 27th Symposium on the Implementation and Application of Functional Programming Languages, pages 1-12. Online publication date: 14-Sep-2015.
https://dl.acm.org/doi/10.1145/2897336.2897341

Madsen F, Filinski A, Grelck C, Henglein F, Acar U, Berthold J (2013). Towards a streaming model for nested data parallelism. Proceedings of the 2nd ACM SIGPLAN workshop on Functional high-performance computing, pages 13-24. Online publication date: 23-Sep-2013.
https://dl.acm.org/doi/10.1145/2502323.2502330

Keller G, Chakravarty M, Leshchinskiy R, Lippmeier B, Peyton Jones S (2012). Vectorisation avoidance. ACM SIGPLAN Notices 47:12, pages 37-48. Online publication date: 13-Sep-2012.
https://dl.acm.org/doi/10.1145/2430532.2364512

Keller G, Chakravarty M, Leshchinskiy R, Lippmeier B, Peyton Jones S, Voigtländer J (2012). Vectorisation avoidance. Proceedings of the 2012 Haskell Symposium, pages 37-48. Online publication date: 13-Sep-2012.
https://dl.acm.org/doi/10.1145/2364506.2364512

Svensson B, Sheeran M, Filinski A, Grelck C (2012). Parallel programming in Haskell almost for free. Proceedings of the 1st ACM SIGPLAN workshop on Functional high-performance computing, pages 3-14. Online publication date: 15-Sep-2012.
https://dl.acm.org/doi/10.1145/2364474.2364477
