GPUs in subsurface simulation: an investigation

  • Original Article
Engineering with Computers

Abstract

Subsurface simulations are computationally intensive and require considerable time to solve. To gain computational speed in simulating subsurface scenarios, this paper investigates the use of graphics processing units (GPUs). Accelerators such as GPUs have a different architecture from conventional central processing units, which necessitates a distinct approach. Various techniques that are well suited to GPUs and to simulating subsurface phenomena are explored, with an emphasis on groundwater flow problems. The finite volume method with implicit time steps for solving groundwater scenarios is discussed and the associated challenges are highlighted. Krylov solvers for large-scale systems, along with preconditioners to improve convergence, are described in detail. An appropriate solver/preconditioner pair that provides speedup for groundwater flow scenarios is identified. The role of matrix storage formats in a GPU environment is examined, and recommendations that could further improve performance are made. The paper concludes with simulation results that test the ideas explored on different generations of NVIDIA GPUs and reports the speedup attained on them.
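For orientation, the groundwater flow problem referred to above can be written in a standard textbook form; the sketch below is given for context only and is not reproduced from the paper, whose exact formulation (e.g., for variably saturated flow) may differ. For a confined aquifer with hydraulic head $h$, specific storage $S_s$, hydraulic conductivity $K$, and source term $q$,

$$S_s \frac{\partial h}{\partial t} = \nabla \cdot \left( K \nabla h \right) + q,$$

and a backward-Euler finite volume discretization over cells of volume $V_i$ with face transmissibilities $T_{ij}$ gives, for each cell $i$,

$$\frac{V_i S_s}{\Delta t}\, h_i^{n+1} - \sum_{j \in \mathcal{N}(i)} T_{ij}\,\bigl(h_j^{n+1} - h_i^{n+1}\bigr) = \frac{V_i S_s}{\Delta t}\, h_i^{n} + V_i q_i,$$

i.e., a large sparse linear system $A\,\mathbf{h}^{n+1} = \mathbf{b}$ to be solved at every implicit time step with a Krylov method. Variably saturated problems make the coefficients depend on $h$ and add an outer nonlinear (Newton) iteration.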




Acknowledgements

The author expresses his gratitude to his colleagues at Canadian Nuclear Laboratories (CNL) for their help and valuable input. The author would also like to thank the PFLOTRAN and PETSc teams at Los Alamos National Laboratory (LANL) and Argonne National Laboratory (ANL), respectively, for their timely help in clarifying questions about integrating the codes for GPUs.

Author information


Corresponding author

Correspondence to Rajeev Das.

Ethics declarations

Funding

This research was funded by Atomic Energy of Canada Limited, under the Federal Nuclear Science and Technology Safety and Security Program, project 51200.02.09.

Conflict of interest

The author declares that he has no conflict of interest.

Appendix A: analysis of the flow scenario


This section presents additional details of the flow simulation on the three Tesla cards with different Krylov solvers and preconditioners. The discussion below clarifies the simulation behavior described earlier.

Figure 13 illustrates the role of the Tesla K80 accelerator in reducing runtimes; the default BiCGStab solver with the point-block ILU preconditioner is used to demonstrate the gain in speed. Table 4 lists the runtimes on all three Tesla cards using the DIA storage format and compares the compute times of the different solver/preconditioner pairs. An entry of >0.70E+06 s indicates a runtime of more than a week. It is evident from Table 4 that GMRES/AMG provides the best performance on a GPU for the flow problem.
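Because the Table 4 runtimes were obtained with the DIA storage format, a minimal CUDA sketch of a DIA-format sparse matrix-vector product is shown below for orientation. The kernel, its parameter names, and the small tridiagonal test case are illustrative assumptions only and are not taken from the PFLOTRAN/PETSc code used in this work; the point of the sketch is that one thread per row and diagonal-contiguous storage give coalesced memory accesses, which is why DIA suits structured-grid matrices on GPUs.

#include <cstdio>
#include <cuda_runtime.h>

// Illustrative sketch (not from the paper): DIA-format SpMV, y = A*x.
// offsets[d] is the offset of diagonal d; data stores each diagonal
// contiguously (num_rows entries per diagonal), so consecutive threads
// read consecutive memory locations (coalesced access).
__global__ void spmv_dia(int num_rows, int num_cols, int num_diags,
                         const int *offsets, const double *data,
                         const double *x, double *y)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < num_rows) {
        double sum = 0.0;
        for (int d = 0; d < num_diags; ++d) {
            int col = row + offsets[d];
            if (col >= 0 && col < num_cols)
                sum += data[d * num_rows + row] * x[col];
        }
        y[row] = sum;
    }
}

int main()
{
    // Small 1-D Laplacian (tridiagonal) test: diagonals at -1, 0, +1.
    const int n = 8, nd = 3;
    int h_off[nd] = {-1, 0, 1};
    double h_data[nd * n], h_x[n], h_y[n];
    for (int i = 0; i < n; ++i) {
        h_data[0 * n + i] = -1.0;  // sub-diagonal (row 0 entry is padding)
        h_data[1 * n + i] =  2.0;  // main diagonal
        h_data[2 * n + i] = -1.0;  // super-diagonal (row n-1 entry is padding)
        h_x[i] = 1.0;
    }

    int *d_off; double *d_data, *d_x, *d_y;
    cudaMalloc(&d_off, nd * sizeof(int));
    cudaMalloc(&d_data, nd * n * sizeof(double));
    cudaMalloc(&d_x, n * sizeof(double));
    cudaMalloc(&d_y, n * sizeof(double));
    cudaMemcpy(d_off, h_off, nd * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_data, h_data, nd * n * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(d_x, h_x, n * sizeof(double), cudaMemcpyHostToDevice);

    spmv_dia<<<(n + 255) / 256, 256>>>(n, n, nd, d_off, d_data, d_x, d_y);
    cudaMemcpy(h_y, d_y, n * sizeof(double), cudaMemcpyDeviceToHost);

    for (int i = 0; i < n; ++i)
        printf("y[%d] = %g\n", i, h_y[i]);  // expect 1, 0, ..., 0, 1

    cudaFree(d_off); cudaFree(d_data); cudaFree(d_x); cudaFree(d_y);
    return 0;
}

A production code would, of course, rely on a library SpMV and add error checking on the CUDA calls; the sketch only makes the storage layout and access pattern explicit.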

Table 4 Comparison of runtimes (s) on the Tesla cards for combinations of different Krylov solvers and preconditioners
Fig. 13

Performance comparison of the BiCGStab solver with the ILU preconditioner on the CPU and on the CPU + Tesla K80 GPU. The benefit of GPU acceleration is visible for problems beyond 64 × 64 × 64 grid points

The numbers of linear and nonlinear iterations for GMRES/AMG, BiCGStab/AMG, and BiCGStab/ILU are reported in Table 5. The counts lie in the range of 38–45 and increase with the problem size. Identical trends in iteration behavior are observed on the remaining cards for the three pairs, so their results are not repeated here. GMRES/ILU is the least promising pair, exhibiting large iteration counts; for example, on the smallest 32 × 32 × 32 problem size its iteration counts were 604, 275, and 213.

Table 5 Linear and nonlinear iteration counts on the Tesla K80 card

Several factors determine the efficiency of a solver, including the distribution of the eigenvalues of the system and the oscillations in the convergence history. An effective preconditioner can overcome such disadvantages, and GMRES in the present scenario provides a good example of this effect: the restarted GMRES converges very slowly, and at times stagnates, on large systems when paired with ILU, but not when the AMG preconditioner is used. A well-implemented preconditioner can therefore play a crucial role in improving performance, provided it adequately exploits the parallelism offered by GPUs and manages the constraints they impose. Figure 14 shows how the iteration counts vary as the grid size increases. Table 4 and Fig. 14 show that even though BiCGStab/AMG requires fewer iterations than GMRES/AMG, the work performed per iteration is lower in the latter, which is reflected in the slightly shorter runtimes of GMRES/AMG in the present example.
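A rough per-iteration operation count (a standard argument introduced here for illustration; the symbols below are assumptions, not quantities reported in the paper) makes this trade-off concrete. Each BiCGStab iteration requires two sparse matrix-vector products and two preconditioner applications, whereas each inner step $j$ of restarted GMRES($m$) requires one of each plus orthogonalization work that grows with $j$:

$$W_{\mathrm{BiCGStab}} \approx 2\,C_{\mathrm{SpMV}} + 2\,C_{\mathrm{prec}} + O(n), \qquad W_{\mathrm{GMRES}(m),\,j} \approx C_{\mathrm{SpMV}} + C_{\mathrm{prec}} + O(j\,n),$$

where $n$ is the number of unknowns and $C_{\mathrm{SpMV}}$ and $C_{\mathrm{prec}}$ denote the cost of one matrix-vector product and one preconditioner application. When the AMG preconditioner dominates the cost ($C_{\mathrm{prec}} \gg C_{\mathrm{SpMV}}$), a GMRES($m$) step is roughly half as expensive as a BiCGStab step, so GMRES/AMG can finish sooner even with a slightly higher iteration count, consistent with the runtimes in Table 4.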

Fig. 14

Iteration counts of the solver/preconditioner pairs on the Tesla K80


About this article


Cite this article

Das, R. GPUs in subsurface simulation: an investigation. Engineering with Computers 33, 919–934 (2017). https://doi.org/10.1007/s00366-017-0506-1
