DOI: 10.1145/3426428.3426915
research-article
Public Access

Intrepydd: performance, productivity, and portability for data science application kernels

Published: 17 November 2020

Abstract

Major simultaneous disruptions are currently under way in both hardware and software. In hardware, "extreme heterogeneity" has become critical to sustaining cost and performance improvements after Moore's Law, but it poses productivity and portability challenges for developers. In software, the rise of large-scale data science is driven by developers who come from diverse backgrounds and who demand the rapid prototyping and interactive-notebook capabilities of high-productivity languages like Python.
We introduce the Intrepydd programming system, which enables data scientists to write application kernels with high performance, productivity, and portability on current and future hardware. Intrepydd is based on Python, though the approach can be applied to other base languages as well. To deliver high performance, the Intrepydd toolchain uses ahead-of-time (AOT) compilation and high-level compiler optimizations of Intrepydd kernels. Intrepydd achieves portability by compiling kernels for execution on different hardware platforms and for invocation from Python or C++ main programs.
An empirical evaluation shows significant performance improvements relative to Python, and shows that Intrepydd can be mapped onto post-Moore accelerators and architectures with relative ease. We believe that Intrepydd represents a new direction of "Discipline-Aware Languages" (DiALs), which brings us closer to the holy grail of obtaining productivity and portability with higher performance than current Python-like languages, and with more generality than current domain-specific languages and libraries.

Supplementary Material

Auxiliary Archive (onward20papers-p32-p-archive.zip)
This supplemental document contains the appendices of the main article: Appendix A, Optimization Algorithms (A.1 Loop Invariant Code Motion Algorithm, A.2 Dense Element-wise Operation Fusion Algorithm, A.3 Sparsity Optimization Algorithm, and A.4 Allocation Hoisting Algorithm), and Appendix B, Benchmark Kernel Codes.
Auxiliary Presentation Video (onward20papers-p32-p-video.mp4)
This is a presentation of our talk at Onward! 2020, which introduces the Intrepydd programming system as summarized in the abstract above.
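The dense element-wise operation fusion named in Appendix A.2 can be illustrated with a small sketch. This is plain Python with NumPy, not Intrepydd syntax and not the paper's algorithm; the function names `unfused` and `fused` and the expression `sqrt(a * b + c)` are hypothetical, chosen only to show the general idea of replacing a chain of array operations (each producing a temporary array) with one loop that computes every output element directly.

```python
import math
import numpy as np


def unfused(a, b, c):
    """Evaluate sqrt(a * b + c) one array operation at a time."""
    t1 = a * b          # first temporary array
    t2 = t1 + c         # second temporary array
    return np.sqrt(t2)  # result array


def fused(a, b, c):
    """Evaluate the same expression in a single loop, with no temporaries."""
    out = np.empty_like(a)
    for i in range(a.shape[0]):
        # All three element-wise operations are applied per element.
        out[i] = math.sqrt(a[i] * b[i] + c[i])
    return out


if __name__ == "__main__":
    a = np.array([1.0, 4.0, 9.0])
    b = np.array([2.0, 0.5, 1.0])
    c = np.array([2.0, 2.0, 7.0])
    # Both versions compute the same values.
    print(np.allclose(unfused(a, b, c), fused(a, b, c)))
```

In interpreted Python the fused loop is slower than the NumPy version; the form only pays off once the loop is compiled to native code, which is the setting an AOT compiler works in.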

Published In

Onward! 2020: Proceedings of the 2020 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software
November 2020
208 pages
ISBN: 9781450381789
DOI: 10.1145/3426428

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. Python
2. application kernels
3. compilers

Conference

SPLASH '20

Acceptance Rates

Overall acceptance rate: 40 of 105 submissions, 38%

Article Metrics

• Downloads (last 12 months): 223
• Downloads (last 6 weeks): 47

Reflects downloads up to 26 Jan 2025.

Cited By

• (2023) Concrete Type Inference for Code Optimization using Machine Learning with SMT Solving. Proceedings of the ACM on Programming Languages 7, OOPSLA2, 773-800. DOI: 10.1145/3622825. Online publication date: 16-Oct-2023.
• (2023) Data Flow Lifecycles for Optimizing Workflow Coordination. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 1-15. DOI: 10.1145/3581784.3607104. Online publication date: 12-Nov-2023.
• (2022) Automatic Parallelization of Python Programs for Distributed Heterogeneous Computing. Euro-Par 2022: Parallel Processing, 350-366. DOI: 10.1007/978-3-031-12597-3_22. Online publication date: 1-Aug-2022.
• (2021) Distributed and Heterogeneous SAR Backprojection with Halide. 2021 IEEE High Performance Extreme Computing Conference (HPEC), 1-9. DOI: 10.1109/HPEC49654.2021.9622855. Online publication date: 20-Sep-2021.
