
Parallelization of torsion finite element code using compressed stiffness matrix algorithm

  • Original Article
  • Published in Engineering with Computers

Abstract

In this study, the elastic and elastoplastic torsion problems were formulated by the finite element method, and the resulting code was parallelized on both shared-memory and distributed-memory architectures. An assembly method with high parallelism and minimal memory consumption was proposed to obtain the compressed global stiffness matrix directly, without forming the dense matrix. Parallel programming principles were presented for both the shared-memory and distributed-memory approaches, and well-known parallel mathematical libraries were briefly reviewed. The main focus of the paper is a clear, detailed explanation of the parallelization mechanisms on the two memory architectures, including some Linux operating system settings for large-scale problems. To verify the proposed method and its parallel performance, several benchmark examples with different mesh sizes were presented and compared with their analytical solutions. The proposed sparse assembly algorithm reduced the required memory significantly (by a factor of about 10^3.5 to 10^5.5), and a speedup of about 3.4 was obtained for the elastoplastic torsion problem on an ordinary multicore computer.
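The memory saving reported above comes from assembling element contributions straight into compressed storage instead of a dense global matrix. The following is a minimal sketch of that idea only — not the authors' actual algorithm — with illustrative connectivity and element values; the element stiffness, dof numbering, and helper names are assumptions:

```python
from collections import defaultdict

def assemble_compressed(elements, n_dof):
    """Assemble element stiffness matrices directly into a sparse
    (row, col) -> value map, never allocating the dense n_dof x n_dof
    global matrix. `elements` is a list of (dof_indices, ke) pairs,
    where ke is a small dense element matrix (list of lists)."""
    K = defaultdict(float)
    for dofs, ke in elements:
        for a, i in enumerate(dofs):
            for b, j in enumerate(dofs):
                K[(i, j)] += ke[a][b]  # shared dofs accumulate
    return K

def to_csr(K, n_dof):
    """Convert the sparse map to compressed sparse row (CSR) arrays.
    (The row scan here is O(n * nnz) -- fine for a sketch, not for
    production assembly.)"""
    row_ptr, col_idx, values = [0], [], []
    for i in range(n_dof):
        for j in sorted(c for (r, c) in K if r == i):
            col_idx.append(j)
            values.append(K[(i, j)])
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, values

# Two 2-dof "elements" sharing dof 1 on a 3-dof mesh (illustrative values)
ke = [[2.0, -1.0], [-1.0, 2.0]]
K = assemble_compressed([((0, 1), ke), ((1, 2), ke)], n_dof=3)
row_ptr, col_idx, values = to_csr(K, 3)
print(row_ptr)  # [0, 2, 5, 7]
print(values)   # [2.0, -1.0, -1.0, 4.0, -1.0, -1.0, 2.0]
```

Only the nonzero entries are stored; for a 2-D torsion mesh with a handful of nonzeros per row, this is where a dense matrix's O(n²) storage collapses to O(nnz), consistent with the 10^3.5–10^5.5 factors quoted for large meshes.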


Notes

  1. Bi-conjugate gradient stabilized.

  2. Conjugate gradient.

  3. Generalized minimal residual method.

  4. In the computational literature, the in-core technique means storing data in the main memory of the computer instead of on the hard disk.

  5. This subroutine is f_dcreate_matrix_dist; for more information, see the SuperLU users' guide.
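The solvers named in notes 1–3 are standard Krylov methods whose cost is dominated by the sparse matrix–vector product, which compressed (CSR) storage makes cheap. As a hedged illustration — not the paper's implementation, and with an illustrative 3×3 system — a plain conjugate gradient iteration operating directly on CSR arrays:

```python
def csr_matvec(row_ptr, col_idx, values, x):
    """y = A @ x for a matrix A stored in CSR form."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

def cg(row_ptr, col_idx, values, b, tol=1e-10, max_iter=200):
    """Conjugate gradient for a symmetric positive definite CSR matrix."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                     # residual b - A x0, with x0 = 0
    p = r[:]
    rs = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        Ap = csr_matvec(row_ptr, col_idx, values, p)
        alpha = rs / sum(pi * Api for pi, Api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * Api for ri, Api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new < tol * tol:   # converged on residual norm
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

# Solve [[2,-1,0],[-1,4,-1],[0,-1,2]] x = [1,2,1]  (SPD, CSR-stored)
row_ptr = [0, 2, 5, 7]
col_idx = [0, 1, 0, 1, 2, 1, 2]
values = [2.0, -1.0, -1.0, 4.0, -1.0, -1.0, 2.0]
x = cg(row_ptr, col_idx, values, [1.0, 2.0, 1.0])
print([round(v, 6) for v in x])  # → [1.0, 1.0, 1.0]
```

The inner `csr_matvec` loop is also the natural place for shared-memory parallelism (each row is independent), which is how OpenMP-style parallel solvers split the work across cores.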

References

  1. Parhami B (2002) Introduction to parallel processing: algorithms and architectures. Kluwer Academic

  2. Kosec G et al (2014) Super linear speedup in a local parallel meshless solution of thermo-fluid problems. Comput Struct 133:30–38

  3. Kalro V et al (1997) Parallel finite element simulation of large ram-air parachutes. Int J Numer Methods Fluids 24(12):1353–1369

  4. Majumder S, Rixner S (2004) Comparing Ethernet and Myrinet for MPI communication. In: Proceedings of the 7th workshop on languages, compilers, and run-time support for scalable systems (LCR 2004), pp 83–89

  5. Turner EL, Hu H (2001) A parallel CFD rotor code using OpenMP. Adv Eng Softw 32(8):665–671

  6. MPI Forum (1994) MPI: a message-passing interface standard. Int J Supercomput Appl High Perform Comput 8:159–416

  7. Nakajima K (2005) Parallel iterative solvers for finite-element methods using an OpenMP/MPI hybrid programming model on the Earth Simulator. Parallel Comput 31(10–12):1048–1065

  8. Bauza CG et al (2009) Parallel implementation of a FEM code by using MPI/PETSc and OpenMP hybrid programming techniques

  9. Vargas-Félix M, Botello-Rionda S (2012) Solution of finite element problems using hybrid parallelization with MPI and OpenMP. Acta Univ 22(7):14–24

  10. Grimes R, Lucas R, Wagenbreth G (2011) Progress on GPU implementation for LS-DYNA implicit mechanics. In: 8th European LS-DYNA users conference, Strasbourg

  11. Garrison LH et al (2018) The Abacus Cosmos: a suite of cosmological N-body simulations. Astrophys J Suppl Ser 236(2):43

  12. Cheng J, Grossman M, McKercher T (2014) Professional CUDA C programming. Wiley

  13. Cercos-Pita JL (2015) AQUAgpusph, a new free 3D SPH solver accelerated with OpenCL. Comput Phys Commun 192:295–312

  14. Domínguez JM et al (2013) New multi-GPU implementation for smoothed particle hydrodynamics on heterogeneous clusters. Comput Phys Commun 184(8):1848–1860

  15. Timoshenko S, Goodier J (1951) Theory of elasticity. McGraw-Hill, New York

  16. Smith JO, Sidebottom OM (1965) Inelastic behavior of load-carrying members. Wiley, New York

  17. Hodge PG, Herakovich CT, Stout RB (1968) On numerical comparisons in elastic-plastic torsion. J Appl Mech 35(3):454–459

  18. Yamada Y, Nakagiri S, Takatsuka K (1972) Elastic-plastic analysis of Saint-Venant torsion problem by a hybrid stress model. Int J Numer Methods Eng 5(2):193–207

  19. May I, Al-Shaarbaf I (1989) Elasto-plastic analysis of torsion using a three-dimensional finite element model. Comput Struct 33(3):667–678

  20. Baniassadi M et al (2010) A novel semi-inverse solution method for elastoplastic torsion of heat treated rods. Meccanica 45(3):375–392

  21. Liu C-S (2007) A meshless regularized integral equation method (MRIEM) for Laplace equation in arbitrary interior or exterior plane domains. Proc ICCES 7:69–80

  22. Krupka J, Šimecek I (2010) Parallel solvers of Poisson's equation. Department of Computer Systems, Faculty of Information Technology, Czech Technical University, Prague, MEMICS 2010

  23. Koric S, Lu Q, Guleryuz E (2014) Evaluation of massively parallel linear sparse solvers on unstructured finite element meshes. Comput Struct 141:19–25

  24. Woźniak M et al (2015) Computational cost of isogeometric multi-frontal solvers on parallel distributed memory machines. Comput Methods Appl Mech Eng 284:971–987

  25. Koric S, Gupta A (2016) Sparse matrix factorization in the implicit finite element method on petascale architecture. Comput Methods Appl Mech Eng 302:281–292

  26. Naumov M (2011) Incomplete-LU and Cholesky preconditioned iterative methods using cuSPARSE and cuBLAS. NVIDIA technical report and white paper

  27. Li A, Mazhar H, Serban R, Negrut D (2015) Comparison of SpMV performance on matrices with different matrix format using CUSP, cuSPARSE and ViennaCL. Technical Report TR-2015-02, SBEL, University of Wisconsin-Madison

  28. Trost N, Jiménez J, Lukarski D, Sanchez V (2015) Accelerating COBAYA3 on multi-core CPU and GPU systems using PARALUTION. Ann Nucl Energy 82:252–259

  29. Sadd MH (2009) Elasticity: theory, applications, and numerics, 2nd edn. Elsevier/AP, Amsterdam

  30. Bland J (1993) Implementation of an algorithm for elastoplastic torsion. Adv Eng Softw 17(1):61–68

  31. Kołodziej JA, Gorzelańczyk P (2012) Application of method of fundamental solutions for elasto-plastic torsion of prismatic rods. Eng Anal Bound Elem 36(2):81–86

  32. Mukhtar FM, Al-Gahtani HJ (2016) Application of radial basis functions to the problem of elasto-plastic torsion of prismatic bars. Appl Math Model 40(1):436–450

  33. Koric S, Hibbeler LC, Thomas BG (2009) Explicit coupled thermo-mechanical finite element model of steel solidification. Int J Numer Methods Eng 78(1):1–31

  34. Li J et al (2012) Elastic–plastic transition in three-dimensional random materials: massively parallel simulations, fractal morphogenesis and scaling functions. Philos Mag 92(22):2733–2758

  35. Samii A, Michoski C, Dawson C (2016) A parallel and adaptive hybridized discontinuous Galerkin method for anisotropic nonhomogeneous diffusion. Comput Methods Appl Mech Eng 304:118–139

  36. Blackford LS, Choi J, Cleary A, D'Azevedo E, Demmel J, Dhillon I, Dongarra J, Hammarling S, Henry G, Petitet A, Stanley K, Walker D, Whaley R (1997) ScaLAPACK users' guide. Society for Industrial and Applied Mathematics

  37. Anderson E, Bai Z, Bischof C, Blackford LS, Demmel J, Dongarra J, Croz JD, Greenbaum A, Hammarling S, McKenney A et al (1999) LAPACK users' guide. SIAM, Philadelphia

  38. Cebrián JM et al (2017) Code modernization strategies to 3-D stencil-based applications on Intel Xeon Phi: KNC and KNL. Comput Math Appl 74(10):2557–2571

  39. Akhter S, Roberts J (2006) Multi-core programming: increasing performance through software multi-threading. Intel Press

  40. Silberschatz A, Galvin PB, Gagne G (2014) Operating system concepts essentials. Wiley

  41. Duff IS, Grimes RG, Lewis JG (1992) Users' guide for the Harwell-Boeing sparse matrix collection (release 1). Technical Report RAL-92-086, Rutherford Appleton Laboratory

  42. Demmel JW, Gilbert J, Li XS (1997) SuperLU users' guide. Computer Science Division, University of California, Berkeley, Tech. Rep. CSD-97-944

  43. Li XS, Demmel JW (2003) SuperLU_DIST: a scalable distributed-memory sparse direct solver for unsymmetric linear systems. ACM Trans Math Softw 29(2):110–140. https://doi.org/10.1145/779359.779361

  44. Snir M, Otto S, Huss-Lederman S, Walker D, Dongarra J (1998) MPI: the complete reference, vol 1. The MIT Press

  45. Hermanns M (2002) Parallel programming in Fortran 95 using OpenMP, vol 75. Universidad Politécnica de Madrid, Madrid

  46. Geuzaine C, Remacle JF (2009) Gmsh: a 3-D finite element mesh generator with built-in pre- and post-processing facilities. Int J Numer Methods Eng 79(11):1309–1331

  47. Wagner W, Gruttmann F (2001) Finite element analysis of Saint-Venant torsion problem with exact integration of the elastic–plastic constitutive equations. Comput Methods Appl Mech Eng 190(29–30):3831–3848

  48. Manchanda N, Anand K (2010) Non-uniform memory access (NUMA). New York University


Author information

Correspondence to Ali Rahmani Firoozjaee.


About this article

Cite this article

Sefidgar, S.M.H., Firoozjaee, A.R. & Dehestani, M. Parallelization of torsion finite element code using compressed stiffness matrix algorithm. Engineering with Computers 37, 2439–2455 (2021). https://doi.org/10.1007/s00366-020-00952-w

DOI: https://doi.org/10.1007/s00366-020-00952-w