A computational fluid dynamics (CFD) code for steady simulations solves a set of non-linear partial differential equations using an iterative time-stepping process, which can follow an explicit or an implicit scheme. On the CPU, the difference between the two time-stepping methods with respect to stability and performance has been well covered in the literature. However, this comparison has not been extended to modern high-performance computing systems such as Graphics Processing Units (GPUs). In this work, we first present an implementation of the two time-stepping methods on the GPU, highlighting the different challenges in the programming approach. Then we introduce a classification of basic CFD operations, based on the degree of parallelism they expose, and study the potential of GPU acceleration for every class. The classification provides local speedups of basic operations, which are finally used to compare the performance of both methods on the GPU. The aim of this work is to enable an informed decision on the most efficient combination of hardware and method when facing a new application. Our findings show that the choice between explicit and implicit time integration relies mainly on the convergence of explicit solvers and the efficiency of preconditioners on the GPU.
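The stability difference between the two time-stepping schemes mentioned in this abstract can be illustrated on a scalar model problem. The following is a minimal sketch only, not the paper's CFD code or its GPU implementation: it assumes a hypothetical relaxation equation du/dt = -k (u - target), whose steady state is u = target, and the function names and parameters are invented for illustration.

```python
def explicit_euler(u0, k, target, dt, steps):
    """Explicit (forward) Euler pseudo-time stepping toward the steady state.

    Each step multiplies the error by (1 - dt * k), so it only
    converges when dt < 2 / k.
    """
    u = u0
    for _ in range(steps):
        u = u + dt * (-k * (u - target))
    return u


def implicit_euler(u0, k, target, dt, steps):
    """Implicit (backward) Euler pseudo-time stepping toward the steady state.

    Solves u_new = u + dt * (-k * (u_new - target)) for u_new in closed form.
    Each step multiplies the error by 1 / (1 + dt * k), so it converges
    for any dt > 0.
    """
    u = u0
    for _ in range(steps):
        u = (u + dt * k * target) / (1.0 + dt * k)
    return u


if __name__ == "__main__":
    k, target = 10.0, 1.0
    # Small step: both schemes reach the steady state.
    print(explicit_euler(0.0, k, target, dt=0.05, steps=100))
    print(implicit_euler(0.0, k, target, dt=0.05, steps=100))
    # Large step (dt > 2 / k): explicit diverges, implicit still converges.
    print(explicit_euler(0.0, k, target, dt=0.3, steps=50))
    print(implicit_euler(0.0, k, target, dt=0.3, steps=50))
```

In a real solver the implicit update requires a (preconditioned) linear solve per step rather than a closed-form division, which is where the trade-off described in the abstract arises.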
A continuum hypothesis-based model is developed for the simulation of the contraction of burns in order to gain new insights into which elements of the healing response might have a substantial influence on this process. Tissue is modeled as a neo-Hookean solid. Furthermore, (myo)fibroblasts, collagen molecules, and a generic signaling molecule are selected as model components. An overview of the custom-made numerical algorithm is presented. Subsequently, good agreement is demonstrated with respect to variability in the evolution of the surface area of burns over time between the outcomes of computer simulations and measurements obtained in an experimental study. In the model this variability is caused by varying the values for some of its parameters simultaneously. A factorial design combined with a regression analysis is used to quantify the individual contributions of these parameter value variations to the dispersion in the surface area of healing burns. The analysis shows that almost all variability in the surface area can be explained by variability in the value for the myofibroblast apoptosis rate and, to a lesser extent, the value for the collagen molecule secretion rate. This suggests that most of the variability in the evolution of the surface area of burns over time in the experimental study might be attributed to variability in these two rates. Finally, a probabilistic analysis is used to investigate in more detail the effect of variability in the values for the two rates on the healing process. Results of this analysis are presented and discussed.
A continuum hypothesis-based, biomechanical model is presented for the simulation of the collagen bundle distribution-dependent contraction and subsequent retraction of healing dermal wounds that cover a large surface area. Since wound contraction mainly takes place in the dermal layer of the skin, only a portion of this layer is included explicitly in the model. This portion of the dermal layer is modeled as a heterogeneous, orthotropic continuous solid with bulk mechanical properties that are locally dependent on both the local concentration and the local geometrical arrangement of the collagen bundles. With respect to the dynamic regulation of the geometrical arrangement of the collagen bundles, it is assumed that a portion of the collagen molecules are deposited and reoriented in the direction of movement of (myo)fibroblasts. The remainder of the newly secreted collagen molecules are deposited by ratio in the direction of the present collagen bundles. Simulation results show that the distribution of the collagen bundles influences the evolution over time of both the shape of the wounded area and the degree of overall contraction of the wounded area. Interestingly, these effects are solely a consequence of alterations in the initial overall distribution of the collagen bundles, and not a consequence of alterations in the evolution over time of the different cell densities and concentrations of the modeled constituents. In accordance with experimental observations, simulation results furthermore show that ultimately the majority of the collagen molecules end up permanently oriented toward the center of the wound and in the plane that runs parallel to the surface of the skin.
A continuum hypothesis-based model is presented for the simulation of the formation and the subsequent regression of hypertrophic scar tissue after dermal wounding. Solely the dermal layer of the skin is modeled explicitly and it is modeled as a heterogeneous, isotropic and compressible neo-Hookean solid. With respect to the constituents of the dermal layer, the following components are selected as primary model components: fibroblasts, myofibroblasts, a generic signaling molecule and collagen molecules. A good match with respect to the evolution of the thickness of the dermal layer of scars between the outcomes of simulations and clinical measurements on hypertrophic scars at different time points after injury in human subjects is demonstrated. Interestingly, the comparison between the outcomes of the simulations and the clinical measurements demonstrates that a relatively high apoptosis rate of myofibroblasts results in scar tissue that behaves more like normal scar tissue with re...
The present work explores the massively parallel capabilities of the most advanced architecture of graphics processing units (GPUs), code-named "Fermi", on a two-dimensional unstructured cell-centred finite volume code. We use the SIMPLE algorithm, fully ported to the GPU, to solve the continuity and momentum equations. The benefits of this implementation are compared with a serial implementation that traditionally runs on the central processing unit (CPU). The developed codes were assessed with the benchmark problems of Poiseuille flow, for Newtonian and generalized Newtonian fluids, as well as the lid-driven cavity and the sudden expansion flows for Newtonian fluids. The parallel (GPU) code accelerated the solution of those three problems by factors of 19, 10 and 11, respectively, in comparison with the corresponding single-core CPU counterpart. The results are a clear indication that GPUs are and will be useful in the field of computational fluid dynamics (CFD) for rheologically simple and complex fluids.
... for Bubbly Flow Problems Jok Man Tang and Kees Vuik ... Sousa, FS, Mangiavacchi, N., Nonato, LG, Castelo, A., Tome, MF, Ferreira, VG, Cuminato, JA, McKee, S.: A Front-Tracking / Front-Capturing Method for the Simulation of 3D Multi-Fluid Flows with Free Surfaces. J. Comp. ...
In this paper we compare two recently proposed methods, FGMRES (Saad, 1993) and GMRESR (van der Vorst and Vuik, 1994), for the iterative solution of sparse linear systems with an unsymmetric nonsingular matrix. Both methods compute minimal residual approximations using preconditioners, which may be different from step to step. The insights resulting from this comparison lead to better variants of
DELFT UNIVERSITY OF TECHNOLOGY REPORT 11-15 Efficient Two-Level Preconditioned Conjugate Gradient Method on the GPU. Rohit Gupta, Martin B. van Gijzen and Kees Vuik ISSN 1389-6520 Reports of the Department of Applied Mathematical Analysis Delft 2011 ...
... Instead, a method based on GCR [23] is used. ... diffusion equation, see [9]. Approximate subdomain solution using a single iteration with ILUD factorization reduced multi-block computing time to almost that of single-block computing time ... Parallel computing is of increasing importance ...
We present an iterative method of preconditioned Krylov type for the solution of large least squares problems. We prove that the method is robust and investigate its rate of convergence. For an important application, originating from seismic inverse scattering, we derive a suitable preconditioner using asymptotic theory. Numerical experiments are used to compare the method with other iterative methods. It appears that the preconditioned Krylov method can be much more efficient than CG applied to the normal equations.
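For orientation, the baseline named in this abstract, CG applied to the normal equations, can be sketched as follows. This is only an illustrative, unpreconditioned version written in plain Python; it is not the preconditioned Krylov method of the paper, and the helper names (`matvec`, `rmatvec`, `cg_normal_equations`) and the small test system are assumptions made for the example.

```python
import math


def matvec(A, x):
    """Return A x for a dense matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Return A^T y without forming the transpose explicitly."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cg_normal_equations(A, b, tol=1e-12, max_iter=100):
    """Minimize ||b - A x||_2 by running CG on A^T A x = A^T b.

    A^T A is never formed; only products with A and A^T are used.
    """
    x = [0.0] * len(A[0])
    r = list(b)              # residual b - A x for the initial guess x = 0
    s = rmatvec(A, r)        # residual of the normal equations, A^T r
    p = list(s)              # first search direction
    gamma = dot(s, s)
    for _ in range(max_iter):
        if math.sqrt(gamma) < tol:
            break
        q = matvec(A, p)
        alpha = gamma / dot(q, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        s = rmatvec(A, r)
        gamma_new = dot(s, s)
        p = [si + (gamma_new / gamma) * pi for si, pi in zip(s, p)]
        gamma = gamma_new
    return x


if __name__ == "__main__":
    # Overdetermined 3x2 system (hypothetical data) with no exact solution.
    A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    b = [1.0, 2.0, 4.0]
    print(cg_normal_equations(A, b))
```

Working with the normal equations squares the condition number of the problem, which is the usual motivation for looking at alternatives with better convergence behaviour.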
In this document the Mumie pilot that took place in March 2010 for the Linear Algebra course (wi1403lr) at Aerospace Engineering will be evaluated. This pilot is the result of an interest in using an e-learning platform that can improve the level of education for first year mathematical courses at TU Delft. In order to be successful with such projects
This report gives an overview of the development of, and experiences with, Mumie [1] at TU Delft during the academic year 2010-2011. Mumie is an e-learning platform that can be used for mathematics courses; it attracted the interest of TU Delft at the beginning of 2009 for use in its first-year mathematics courses, in particular Linear
We propose and analyze a generic multi-class kinematic wave traffic flow model: Fastlane. The model takes into account heterogeneity among driver-vehicle units with respect to speed and space occupancy: long vehicles with large headways (e.g. trucks) take more space than short vehicles with short headways (e.g. passenger cars). Moreover, and this is what makes the model unique, this effect is larger when the traffic volume is higher. This state-dependent space occupancy is reflected in dynamic passenger car equivalent values. The resulting model is shown to satisfy important requirements such as providing a unique solution and being anisotropic. Simulations are used to compare Fastlane to other multi-class models. Furthermore, we show that the characteristic velocity depends on the truck share, which is one of the main consequences of our modeling approach.
A challenge of today's research is the realistic simulation of disordered atomistic systems or particulate and granular materials like sand, powders, ceramics or composites, which consist of many millions of atoms/particles. The inhomogeneous fine-structure of such materials makes it very difficult to treat these with continuum methods, which typically assume homogeneity and scale separation. As an alternative, particle based methods
... the finite element method Guus Segal and Kees Vuik ... To that end we assume that the nodal points have been renumbered before, in order to get a small profile, for example by the standard reversed Cuthill-McKee renumbering algorithm [14] or the algorithm proposed by Sloan [21]. ...