Abstract
Parallel programming remains a daunting challenge: expressing a parallel algorithm without cluttering the underlying synchronous logic is difficult, as is choosing the right tools to ensure a calculation is performed correctly. Over the years, numerous solutions have emerged, requiring new programming languages, language extensions, or compiler pragmas, and support for these tools and extensions is available only to varying degrees. In recent years, the C++ standards committee has worked to refine the language features and libraries needed to support parallel programming on a single computational node. Eventually, all major vendors and compilers will provide robust and performant implementations of these standards. Until then, the HPX library and runtime provide cutting-edge implementations of the standards, as well as of proposed standards and extensions. Because of these advances, it is now possible to write high-performance parallel code without custom extensions to C++. We provide an overview of modern parallel programming in C++, describing the language and library features and giving brief examples of how to use them.
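As a minimal illustration of this last point (a sketch added here, not taken from the article), the following program sums the squares of a vector's elements with a C++17 standard parallel algorithm. It assumes a standard library with <execution> support; HPX offers the same algorithm interface in its own namespace.

#include <cstdio>
#include <execution>
#include <functional>
#include <numeric>
#include <vector>

int main() {
    std::vector<double> v(1'000'000, 1.0);
    // With the std::execution::par policy the per-element work may run on
    // multiple threads; the result matches the sequential sum up to
    // floating-point reordering.
    double sum = std::transform_reduce(std::execution::par,
                                       v.begin(), v.end(), 0.0,
                                       std::plus<>{},
                                       [](double x) { return x * x; });
    std::printf("sum = %f\n", sum);
    return 0;
}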
Acknowledgements
We would also like to thank Alireza Kheirkhahan and the HPC admins who support the Deep Bayou cluster at Louisiana State University.
Funding
The authors would like to thank Stony Brook Research Computing and Cyberinfrastructure, and the Institute for Advanced Computational Science at Stony Brook University for access to the innovative high-performance Ookami computing system, which was made possible by a $5M National Science Foundation grant (#1927880).
Data availability
The code for all examples is available on GitHub and Zenodo.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Research involving human participants and/or animals
This article does not contain any studies with human participants performed by any of the authors.
Informed consent
Not applicable, since no humans were involved in our research.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article is part of the topical collection “Applications and Frameworks using the Asynchronous Many Task Paradigm” guest edited by Patrick Diehl, Hartmut Kaiser, Peter Thoman, Steven R. Brandt and Ram Ramanujam.
Additional Code Listing
The implementation of the Mandelbrot set computation is shown using futures in Listing 7, coroutines in Listing 8, parallel algorithms in Listing 9, and senders and receivers in Listing 10.
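The listings themselves are not reproduced here. As a rough sketch in the spirit of the parallel-algorithm variant (Listing 9), the escape-time computation for each pixel can be parallelized over image rows with a standard C++17 algorithm; the image dimensions, iteration limit, and coordinate window below are arbitrary choices, and the article's actual listings may differ in API and structure.

#include <algorithm>
#include <complex>
#include <execution>
#include <numeric>
#include <vector>

// Escape-time iteration count for a point c of the complex plane.
int escape_time(std::complex<double> c, int max_iter) {
    std::complex<double> z{0.0, 0.0};
    int i = 0;
    while (i < max_iter && std::norm(z) <= 4.0) {  // |z| <= 2
        z = z * z + c;
        ++i;
    }
    return i;
}

int main() {
    const int width = 1024, height = 1024, max_iter = 256;
    std::vector<int> image(width * height);
    std::vector<int> rows(height);
    std::iota(rows.begin(), rows.end(), 0);

    // Rows are independent, so they can be processed by a parallel algorithm.
    std::for_each(std::execution::par, rows.begin(), rows.end(), [&](int y) {
        for (int x = 0; x < width; ++x) {
            std::complex<double> c(-2.0 + 3.0 * x / width,
                                   -1.5 + 3.0 * y / height);
            image[y * width + x] = escape_time(c, max_iter);
        }
    });
    return 0;
}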
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Diehl, P., Brandt, S.R. & Kaiser, H. Shared Memory Parallelism in Modern C++ and HPX. SN COMPUT. SCI. 5, 459 (2024). https://doi.org/10.1007/s42979-024-02769-6
DOI: https://doi.org/10.1007/s42979-024-02769-6