
An Evaluation and Comparison of GPU Hardware and Solver Libraries for Accelerating the OPM Flow Reservoir Simulator

Authors: Tong Dong Qiu, Andreas Thune, Markus Blatt, Alf Birger Rustad, Razvan Nane

Abstract: Realistic reservoir simulation is known to be prohibitively expensive in terms of computation time when increasing the accuracy of the simulation or enlarging the model grid size. One method to address this issue is to parallelize the computation by dividing the model into several partitions and using multiple CPUs to compute the result using techniques such as MPI and multi-threading. Alternatively, GPUs are also a good candidate to accelerate the computation due to their massively parallel architecture that allows many floating point operations per second to be performed. The numerical iterative solver takes the most computational time and is challenging to solve efficiently due to the dependencies that exist in the model between cells. In this work, we evaluate the OPM Flow simulator and compare several state-of-the-art GPU solver libraries as well as custom developed solutions for a BiCGStab solver using an ILU0 preconditioner and benchmark their performance against the default DUNE library implementation running on multiple CPU processors using MPI. The evaluated GPU software libraries include a manual linear solver in OpenCL and the integration of several third-party sparse linear algebra libraries, such as cuSparse, rocSparse, and amgcl. To perform our benchmarking, we use small, medium, and large use cases, starting with the public test case NORNE that includes approximately 50k active cells and ending with a large model that includes approximately 1 million active cells. We find that a GPU can accelerate a single dual-threaded MPI process up to 5.6 times, and that it can compare with around 8 dual-threaded MPI processes.

Submitted 20 September, 2023; originally announced September 2023.

arXiv:2101.01745 [pdf, other]

doi 10.1145/3476229

Hardware Acceleration of HPC Computational Flow Dynamics using HBM-enabled FPGAs

Authors: Tom Hogervorst, Tong Dong Qiu, Giacomo Marchiori, Alf Birger, Markus Blatt, Razvan Nane

Abstract: Scientific computing is at the core of many High-Performance Computing applications, including computational flow dynamics. Because of the utmost importance of simulating increasingly larger computational models, hardware acceleration is receiving increased attention due to its potential to maximize the performance of scientific computing. A Field-Programmable Gate Array is a reconfigurable hardware accelerator that is fully customizable in terms of computational resources and memory storage requirements of an application during its lifetime. Therefore, it is an ideal candidate to accelerate scientific computing applications because of the possibility to fully customize the memory hierarchy important in irregular applications such as iterative linear solvers found in scientific libraries. In this paper, we study the potential of using FPGAs in HPC because of the rapid advances in reconfigurable hardware, such as the increase in on-chip memory size, increasing number of logic cells, and the integration of High-Bandwidth Memories on board. To perform this study, we first propose a novel ILU0 preconditioner tightly integrated with a BiCGStab solver kernel designed using a mixture of High-Level Synthesis and Register-Transfer Level hand-coded design. Second, we integrate the developed preconditioned iterative solver in Flow from the Open Porous Media (OPM) project, a state-of-the-art open-source reservoir simulator. Finally, we perform a thorough evaluation of the FPGA solver kernel in both standalone mode and integrated into the reservoir simulator that includes all the on-chip URAM and BRAM, on-board High-Bandwidth Memory, and off-chip CPU memory data transfers required in a complex simulator software such as OPM's Flow. We evaluate the performance on the Norne field, a real-world case reservoir model using a grid with more than 10^5 cells and using 3 unknowns per cell.

Submitted 5 January, 2021; originally announced January 2021.

Report number: Article No.: 20, pp 1--35

Journal ref: ACM Transactions on Reconfigurable Technology and Systems, Volume 15, Issue 2, June 2022

arXiv:1910.06059 [pdf, other]

The Open Porous Media Flow Reservoir Simulator

Authors: Atgeirr Flø Rasmussen, Tor Harald Sandve, Kai Bao, Andreas Lauser, Joakim Hove, Bård Skaflestad, Robert Klöfkorn, Markus Blatt, Alf Birger Rustad, Ove Sævareid, Knut-Andreas Lie, Andreas Thune

Abstract: The Open Porous Media (OPM) initiative is a community effort that encourages open innovation and reproducible research for simulation of porous media processes. OPM coordinates collaborative software development, maintains and distributes open-source software and open data sets, and seeks to ensure that these are available under a free license in a long-term perspective. In this paper, we present OPM Flow, which is a reservoir simulator developed for industrial use, as well as some of the individual components used to make OPM Flow. The descriptions apply to the 2019.10 release of OPM.

Submitted 4 October, 2019; originally announced October 2019.

Comments: 43 pages, 22 figures

MSC Class: 76S05; 68N01; 97N80

arXiv:1909.13672 [pdf, other]

The DUNE Framework: Basic Concepts and Recent Developments

Authors: Peter Bastian, Markus Blatt, Andreas Dedner, Nils-Arne Dreier, Christian Engwer, René Fritze, Carsten Gräser, Christoph Grüninger, Dominic Kempf, Robert Klöfkorn, Mario Ohlberger, Oliver Sander

Abstract: This paper presents the basic concepts and the module structure of the Distributed and Unified Numerics Environment and reflects on recent developments and general changes that happened since the release of the first Dune version in 2007 and the main papers describing that state [1, 2]. This discussion is accompanied with a description of various advanced features, such as coupling of domains and cut cells, grid modifications such as adaptation and moving domains, high order discretizations and node level performance, non-smooth multigrid methods, and multiscale methods. A brief discussion on current and future development directions of the framework concludes the paper.

Submitted 22 June, 2020; v1 submitted 30 September, 2019; originally announced September 2019.

Comments: 69 pages, 14 figures, 4 tables and various code examples

MSC Class: 76S05; 68N01

arXiv:1309.1783 [pdf, ps, other]

DUNE as an Example of Sustainable Open Source Scientific Software Development

Authors: Markus Blatt

Abstract: In this paper we describe how DUNE, an open source scientific software framework, is developed. Having a sustainable software framework for the solution of partial differential equations is the main driver of DUNE's development. We take a look at how DUNE strives to remain sustainable software.

Submitted 15 September, 2013; v1 submitted 6 September, 2013; originally announced September 2013.

ACM Class: D.2.9; K.6.1; K.6.3

arXiv:1209.0960 [pdf, ps, other]

A Massively Parallel Algebraic Multigrid Preconditioner based on Aggregation for Elliptic Problems with Heterogeneous Coefficients

Authors: Markus Blatt, Olaf Ippisch, Peter Bastian

Abstract: This paper describes a massively parallel algebraic multigrid method based on non-smoothed aggregation. It is especially suited for solving heterogeneous elliptic problems as it uses a greedy heuristic algorithm for the aggregation that detects changes in the coefficients and prevents aggregation across them. Using decoupled aggregation on each process with data agglomeration onto fewer processes on the coarse level, it weakly scales well in terms of both total time to solution and time per iteration to nearly 300,000 cores. Because of simple piecewise constant interpolation between the levels, its memory consumption is low and allows solving problems with more than 100,000,000,000 degrees of freedom.

Submitted 30 September, 2013; v1 submitted 5 September, 2012; originally announced September 2012.

Comments: 22 pages, 1 figure

MSC Class: 65F08; 65N08; 65N55; 65Y05 ACM Class: F.2.1; G.1.3; G.1.8; G.4

Showing 1–6 of 6 results for author: Blatt, M