
hyper.deal: An Efficient, Matrix-free Finite-element Library for High-dimensional Partial Differential Equations

Published: 28 September 2021

Abstract

This work presents hyper.deal, an efficient, matrix-free finite-element library for solving partial differential equations in two to six dimensions with high-order discontinuous Galerkin methods. It builds upon the low-dimensional finite-element library deal.II to create complex low-dimensional meshes and to operate on them individually. These meshes are combined via a tensor product on the fly, and the library provides new, highly optimized special-purpose matrix-free functions that exploit domain decomposition as well as shared memory via MPI-3.0 features. Both node-level performance analyses and strong/weak-scaling studies on up to 147,456 CPU cores confirm the efficiency of the implementation. Results obtained with hyper.deal are reported for high-dimensional advection problems and for the solution of the Vlasov–Poisson equation in up to six-dimensional phase space.
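
As a rough illustration of what combining two low-dimensional meshes "via a tensor product on the fly" can mean, the sketch below never stores the high-dimensional mesh: a flat cell index is split into a pair of low-dimensional cell indices whenever a cell is needed. This is a minimal sketch under stated assumptions, not hyper.deal's or deal.II's API. The function names, the "x runs fastest" ordering, and the representation of each mesh as a plain list of cell centres are all hypothetical choices made for the example.

```python
# Hypothetical sketch: tensor product of two low-dimensional meshes,
# formed on the fly by index arithmetic instead of being stored.

def tensor_cell_index(i_x, i_v, n_x):
    """Flat index of the product cell (i_x, i_v); the x index runs fastest."""
    return i_x + n_x * i_v


def split_cell_index(i, n_x):
    """Inverse of tensor_cell_index: recover (i_x, i_v) from the flat index."""
    return i % n_x, i // n_x


def tensor_cell_center(i, centers_x, centers_v):
    """Centre of product cell i, built by concatenating low-dimensional centres."""
    i_x, i_v = split_cell_index(i, len(centers_x))
    return centers_x[i_x] + centers_v[i_v]  # tuple concatenation


# Two assumed low-dimensional meshes, each given only by its cell centres.
centers_x = [(0.25,), (0.75,)]                      # 1D mesh with 2 cells
centers_v = [(0.5, 0.5), (1.5, 0.5), (2.5, 0.5)]    # 2D mesh with 3 cells

# The 3D product mesh has 2 * 3 = 6 cells, none of which is stored explicitly.
n_cells = len(centers_x) * len(centers_v)
product_centers = [tensor_cell_center(i, centers_x, centers_v)
                   for i in range(n_cells)]
```

In this sketch the storage cost grows with the sum of the two mesh sizes, while the number of addressable product cells grows with their product, which is the reason for forming the product lazily.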



Published In

ACM Transactions on Mathematical Software, Volume 47, Issue 4
December 2021
242 pages
ISSN:0098-3500
EISSN:1557-7295
DOI:10.1145/3485138
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 28 September 2021
Accepted: 01 June 2021
Revised: 01 May 2021
Received: 01 February 2020
Published in TOMS Volume 47, Issue 4


Author Tags

  1. Matrix-free operator evaluation
  2. discontinuous Galerkin methods
  3. high-dimensional
  4. high-order
  5. Vlasov–Poisson equation
  6. MPI-3.0 shared memory

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • German Research Foundation (DFG)


Cited By

  • (2024) Cache-optimized and low-overhead implementations of additive Schwarz methods for high-order FEM multigrid computations. International Journal of High Performance Computing Applications 38:3 (192-209). DOI: 10.1177/10943420231217221. Online publication date: 15-May-2024
  • (2024) Matrix-Free Monolithic Multigrid Methods for Stokes and Generalized Stokes Problems. SIAM Journal on Scientific Computing 46:3 (A1599-A1627). DOI: 10.1137/22M1504184. Online publication date: 9-May-2024
  • (2024) On the construction of an efficient finite-element solver for phase-field simulations of many-particle solid-state-sintering processes. Computational Materials Science 231 (112589). DOI: 10.1016/j.commatsci.2023.112589. Online publication date: Jan-2024
  • (2023) Enhancing data locality of the conjugate gradient method for high-order matrix-free finite-element implementations. International Journal of High Performance Computing Applications 37:2 (61-81). DOI: 10.1177/10943420221107880. Online publication date: 1-Mar-2023
  • (2023) Efficient Distributed Matrix-free Multigrid Methods on Locally Refined Meshes for FEM Computations. ACM Transactions on Parallel Computing 10:1 (1-38). DOI: 10.1145/3580314. Online publication date: 29-Mar-2023
  • (2023) Stage-Parallel Fully Implicit Runge–Kutta Implementations with Optimal Multilevel Preconditioners at the Scaling Limit. SIAM Journal on Scientific Computing 46:2 (S71-S96). DOI: 10.1137/22M1503270. Online publication date: 18-Jul-2023
  • (2022) Extending FEniCS to work in higher dimensions using tensor product finite elements. Journal of Computational Science 64 (101831). DOI: 10.1016/j.jocs.2022.101831. Online publication date: Oct-2022
  • (2022) Efficient Application of Hanging-Node Constraints for Matrix-Free High-Order FEM Computations on CPU and GPU. High Performance Computing (133-152). DOI: 10.1007/978-3-031-07312-0_7. Online publication date: 29-May-2022
  • (2020) ExaDG: High-Order Discontinuous Galerkin for the Exa-Scale. Software for Exascale Computing - SPPEXA 2016-2019 (189-224). DOI: 10.1007/978-3-030-47956-5_8. Online publication date: 31-Jul-2020
