research-article
Open access

Getting to the point: index sets and parallelism-preserving autodiff for pointful array programming

Published: 19 August 2021

Abstract

We present a novel programming language design that attempts to combine the clarity and safety of high-level functional languages with the efficiency and parallelism of low-level numerical languages. We treat arrays as eagerly memoized functions on typed index sets, allowing abstract function manipulations, such as currying, to work on arrays. In contrast to composing primitive bulk-array operations, we argue for an explicit nested indexing style that mirrors the application of functions to arguments. We also introduce a fine-grained typed effects system that affords concise and automatically parallelized in-place updates. Specifically, an associative accumulation effect allows reverse-mode automatic differentiation of in-place updates in a way that preserves parallelism. Empirically, we benchmark against the Futhark array programming language and demonstrate that aggressive inlining and type-driven compilation allow array programs to be written in an expressive, "pointful" style with little performance penalty.
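The ideas in the abstract can be illustrated with a small sketch. It is written in plain Python, not in the paper's language, and it is not the authors' implementation. The helper names (`tabulate`, `curry_table`, `matmul`, `scatter_add`) are hypothetical, chosen only for this illustration. The sketch assumes that an "array" can be modelled as a dictionary built by evaluating a function eagerly at every element of an index set, and that integer addition stands in for the associative accumulation.

```python
def tabulate(index_set, f):
    """Eagerly evaluate f at every index, giving a lookup table (a dict)."""
    return {i: f(i) for i in index_set}


def curry_table(table, rows, cols):
    """Turn a table keyed by (i, j) pairs into a table of tables."""
    return tabulate(rows, lambda i: tabulate(cols, lambda j: table[(i, j)]))


def matmul(x, y, I, J, K):
    """Pointful matrix product: every output element is defined by its indices."""
    return tabulate(
        I, lambda i: tabulate(K, lambda k: sum(x[i][j] * y[j][k] for j in J))
    )


def scatter_add(indices, values, size):
    """Accumulate values[n] into slot indices[n].

    Integer addition is associative and commutative, so the updates can be
    applied in any order (and hence in parallel) with the same result.
    """
    out = [0] * size
    for n, v in zip(indices, values):
        out[n] += v
    return out


# Hypothetical index sets and inputs for the illustration.
I, J, K = range(2), range(3), range(2)
x = tabulate(I, lambda i: tabulate(J, lambda j: i + j))
y = tabulate(J, lambda j: tabulate(K, lambda k: j * k))
z = matmul(x, y, I, J, K)
```

Indexing `z[i][k]` reads like applying a function to two arguments, which is the sense in which indexing mirrors function application. `scatter_add` is the reverse-mode counterpart of a gather `ys[n] = xs[indices[n]]`: each output cotangent is added into the slot it was read from, and because the accumulation is associative the order of the updates does not matter.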

Supplementary Material

Auxiliary Presentation Video (icfp21main-p136-p-video.mp4)
MP4 File (3473593.mp4)


Information

Published In

Proceedings of the ACM on Programming Languages  Volume 5, Issue ICFP
August 2021
1011 pages
EISSN:2475-1421
DOI:10.1145/3482883
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

  • Distinguished Paper

Author Tags

  1. Array programming
  2. automatic differentiation
  3. parallel computing

Qualifiers

  • Research-article


Cited By

  • (2024) Parallel Algebraic Effect Handlers. Proceedings of the ACM on Programming Languages 8:ICFP (756-788). DOI:10.1145/3674651. Online publication date: 15-Aug-2024
  • (2024) (De/Re)-Composition of Data-Parallel Computations via Multi-Dimensional Homomorphisms. ACM Transactions on Programming Languages and Systems 46:3 (1-74). DOI:10.1145/3665643. Online publication date: 10-Oct-2024
  • (2024) Efficient CHAD. Proceedings of the ACM on Programming Languages 8:POPL (1060-1088). DOI:10.1145/3632878. Online publication date: 5-Jan-2024
  • (2024) A Tensor Algebra Compiler for Sparse Differentiation. Proceedings of the 2024 IEEE/ACM International Symposium on Code Generation and Optimization (1-12). DOI:10.1109/CGO57630.2024.10444787. Online publication date: 2-Mar-2024
  • (2024) A framework for higher-order effects & handlers. Science of Computer Programming 234:C. DOI:10.1016/j.scico.2024.103086. Online publication date: 25-Jun-2024
  • (2023) SLANG.D: Fast, Modular and Differentiable Shader Programming. ACM Transactions on Graphics 42:6 (1-28). DOI:10.1145/3618353. Online publication date: 5-Dec-2023
  • (2023) Infix-Extensible Record Types for Tabular Data. Proceedings of the 8th ACM SIGPLAN International Workshop on Type-Driven Development (29-43). DOI:10.1145/3609027.3609406. Online publication date: 30-Aug-2023
  • (2023) Quantum Computing with Differentiable Quantum Transforms. ACM Transactions on Quantum Computing 4:3 (1-20). DOI:10.1145/3592622. Online publication date: 26-Jun-2023
  • (2023) Better Defunctionalization through Lambda Set Specialization. Proceedings of the ACM on Programming Languages 7:PLDI (977-1000). DOI:10.1145/3591260. Online publication date: 6-Jun-2023
  • (2023) (De/Re)-Compositions Expressed Systematically via MDH-Based Schedules. Proceedings of the 32nd ACM SIGPLAN International Conference on Compiler Construction (61-72). DOI:10.1145/3578360.3580269. Online publication date: 17-Feb-2023
