research-article
DOI: 10.1145/3447818.3460360

NPBench: a benchmarking suite for high-performance NumPy

Published: 04 June 2021

Abstract

    Python, already one of the most popular languages for scientific computing, has made significant inroads in High Performance Computing (HPC). At the center of Python's ecosystem is NumPy, an efficient implementation of the multi-dimensional array (tensor) structure, together with basic arithmetic and linear algebra. Compared to traditional HPC languages, the relatively low performance of Python and NumPy has spawned significant research in compilers and frameworks that decouple Python's compact representation from the underlying implementation. However, it is challenging to compare language compatibility and performance among different frameworks and architectures without a standard set of benchmarks and metrics. To that end, we introduce NPBench, a set of NumPy code samples representing a large variety of HPC applications. We use NPBench to test popular NumPy-accelerating compilers and frameworks on a variety of metrics. NPBench will guide both end-users and framework developers focusing on performance and will drive further use of Python in the high-performance scientific domains.
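
As a rough illustration of what benchmarking a NumPy code sample can involve, the sketch below times a small stencil kernel and reports the median over several runs. It is not NPBench's own harness or API: the kernel choice (a 1-D Jacobi stencil) and the names `jacobi_1d`, `make_args`, and `median_runtime` are hypothetical, chosen only for this example.

```python
import time

import numpy as np


def jacobi_1d(tsteps, a, b):
    """Illustrative NumPy kernel (1-D Jacobi stencil); not taken from NPBench."""
    for _ in range(tsteps):
        # Each interior point becomes the mean of itself and its two neighbours.
        b[1:-1] = (a[:-2] + a[1:-1] + a[2:]) / 3.0
        a[1:-1] = (b[:-2] + b[1:-1] + b[2:]) / 3.0
    return a


def make_args(n=1000, tsteps=10):
    """Build fresh inputs so every timed run starts from the same state."""
    a = np.linspace(0.0, 1.0, n)
    return tsteps, a, a.copy()


def median_runtime(kernel, args_factory, repeat=5):
    """Hypothetical helper: median wall-clock time of `kernel` in seconds."""
    times = []
    for _ in range(repeat):
        args = args_factory()  # input setup is excluded from the timing
        start = time.perf_counter()
        kernel(*args)
        times.append(time.perf_counter() - start)
    return float(np.median(times))


runtime = median_runtime(jacobi_1d, make_args)
print(f"median runtime: {runtime:.6f} s")
```

Comparing frameworks would then mean running the same kernel through each one and comparing these medians. Using the median rather than the mean is a design choice of this sketch, made to limit the influence of outlier runs.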

            Published In

            ICS '21: Proceedings of the 35th ACM International Conference on Supercomputing
            June 2021
            506 pages
            ISBN: 9781450383356
            DOI: 10.1145/3447818

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Author Tags

            1. NumPy
            2. Python
            3. benchmark
            4. high performance computing

            Qualifiers

            • Research-article

            Acceptance Rates

            ICS '21 Paper Acceptance Rate 39 of 157 submissions, 25%;
            Overall Acceptance Rate 629 of 2,180 submissions, 29%

            Article Metrics

            • Downloads (Last 12 months): 134
            • Downloads (Last 6 weeks): 4
            Reflects downloads up to 10 Aug 2024

            Cited By

            • (2024) APPy: Annotated Parallelism for Python on GPUs. Proceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction. 10.1145/3640537.3641575 (113-125). Online publication date: 17-Feb-2024
            • (2024) Optimized Python library for reconstruction of ensemble-based gene co-expression networks using multi-GPU. The Journal of Supercomputing. 10.1007/s11227-024-06127-4 80:12 (18142-18176). Online publication date: 1-Aug-2024
            • (2024) Evaluation of Alternatives to Accelerate Scientific Numerical Calculations on Graphics Processing Units Using Python. High Performance Computing. 10.1007/978-3-031-52186-7_1 (3-20). Online publication date: 28-Jan-2024
            • (2023) A Long Short-Term Memory-Based Prototype Model for Drought Prediction. Electronics. 10.3390/electronics12183956 12:18 (3956). Online publication date: 20-Sep-2023
            • (2023) FuzzyFlow: Leveraging Dataflow To Find and Squash Program Optimization Bugs. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. 10.1145/3581784.3613214 (1-15). Online publication date: 12-Nov-2023
            • (2023) Hay: Enhancing GPU Sharing Performance With Two-Level Scheduling for Ray. 2023 IEEE 29th International Conference on Parallel and Distributed Systems (ICPADS). 10.1109/ICPADS60453.2023.00410 (2865-2868). Online publication date: 17-Dec-2023
            • (2023) Simplifying non-contiguous data transfer with MPI for Python. The Journal of Supercomputing. 10.1007/s11227-023-05398-7 79:17 (20019-20040). Online publication date: 7-Jun-2023
            • (2022) Boosting performance optimization with interactive data movement visualization. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis. 10.5555/3571885.3571970 (1-16). Online publication date: 13-Nov-2022
            • (2022) Boosting Performance Optimization with Interactive Data Movement Visualization. SC22: International Conference for High Performance Computing, Networking, Storage and Analysis. 10.1109/SC41404.2022.00069 (1-16). Online publication date: Nov-2022
            • (2022) NAS Parallel Benchmark Kernels with Python: A performance and programming effort analysis focusing on GPUs. 2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP). 10.1109/PDP55904.2022.00013 (26-33). Online publication date: Mar-2022
