research-article

On the arithmetic intensity of high-order finite-volume discretizations for hyperbolic systems of conservation laws

Authors:

JAF Hittinger

International Journal of High Performance Computing Applications, Volume 33, Issue 1

Pages 25 - 52

https://doi.org/10.1177/1094342017691876

Published: 01 January 2019

Abstract

It has been conjectured that higher-order discretizations for partial differential equations will have advantages over the lower-order counterparts commonly used today. The reasoning is that the increase in arithmetic operations will be more than offset by the reduction in data transfers and the increase in concurrent floating-point units. To evaluate this conjecture, the arithmetic intensity of a class of high-order finite-volume discretizations for hyperbolic systems of conservation laws is theoretically analyzed for spatial discretizations from orders three through eight in arbitrary dimensions. Three cache models are considered: the limiting cases of no cache and infinite cache as well as a finite-sized cache model. Models are validated experimentally by measuring floating-point operations and data transfers on an IBM Blue Gene/Q node. Theory and experiments demonstrate that high-order finite-volume methods will be able to provide increases in arithmetic intensity that will be necessary to make better utilization of on-node floating-point capability.
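The quantity at the center of the abstract, arithmetic intensity, is the ratio of floating-point operations to bytes moved to and from memory. The sketch below illustrates how the two limiting cache models change that ratio. It uses a deliberately simplified toy model that is an assumption for illustration and not the paper's analysis: a one-dimensional stencil that combines `width` double-precision values into one output. The function names and the flop and byte counts are hypothetical.

```python
# Toy model (an assumption, not the paper's analysis): a 1D stencil that
# combines `width` neighboring cell values into one output value.

WORD_BYTES = 8  # size of one double-precision value


def stencil_flops(width: int) -> int:
    """Flops per output: `width` multiplies plus `width - 1` adds."""
    return 2 * width - 1


def intensity_no_cache(width: int) -> float:
    """No cache: every stencil input is reloaded for every output."""
    bytes_moved = (width + 1) * WORD_BYTES  # `width` loads + 1 store
    return stencil_flops(width) / bytes_moved


def intensity_infinite_cache(width: int) -> float:
    """Infinite cache: each value is loaded once and stored once."""
    bytes_moved = 2 * WORD_BYTES  # 1 load + 1 store per cell
    return stencil_flops(width) / bytes_moved


if __name__ == "__main__":
    # Wider stencils stand in for higher-order discretizations.
    for width in (3, 5, 7, 9):
        print(width, intensity_no_cache(width), intensity_infinite_cache(width))
```

In this toy model the no-cache intensity stays below 0.25 flops per byte however wide the stencil becomes, while the infinite-cache intensity grows linearly with the stencil width. A finite-sized cache would fall between these two limits.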

Published In

International Journal of High Performance Computing Applications, Volume 33, Issue 1, January 2019, 221 pages

ISSN: 1094-3420

Copyright © The Authors 2017.

Publisher

Sage Publications, Inc., United States
