DOI: 10.1145/3178433.3178441
research-article

Vectorization of a spectral finite-element numerical kernel

Published: 24 February 2018

Abstract

In this paper, we present an implementation of the finite-element method (FEM) numerical kernel optimized for SIMD vectorization. A typical application is the modelling of seismic wave propagation. In this case, the computations at the element level are generally based on nested loops with non-contiguous memory accesses. Moreover, moving back and forth between the element level and the global level (e.g., during the assembly phase) is a serious obstacle to automatic vectorization by compilers and to efficient data reuse in the cache hierarchy. This is particularly true when the problem under study relies on an unstructured mesh. The application proxies used for our experiments were extracted from the EFISPEC code, which implements the spectral finite-element method to solve the elastodynamic equations. We underline that intra-node performance may be further improved. Additionally, we show that standard compilers such as GNU GCC, Clang and Intel ICC are unable to perform automatic vectorization even when the nested loops are reorganized or SIMD pragmas are added. Because of the irregular memory access pattern, we introduce a dedicated strategy to squeeze the maximum performance out of the SIMD units. Experiments are carried out on Intel Broadwell and Skylake platforms, which offer AVX2 and AVX-512 SIMD units, respectively. We believe that our vectorization approach may be generic enough to be adapted to other codes.
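
To make the access pattern described above concrete, the following minimal sketch contrasts a scalar element loop (gather from the global array, element-level computation, scatter-add during assembly) with a batched variant that processes several elements together, one element per SIMD lane. It is an illustration under stated assumptions, not the strategy of the paper: the abstract does not detail that strategy, and the toy kernel, the connectivity table `conn`, the width `W`, and the function names are all hypothetical.

```python
# Illustrative sketch only. The mesh, the toy kernel, and the batching scheme
# are assumptions; none of this is taken from the paper or from EFISPEC.

W = 4  # hypothetical SIMD width (number of lanes)


def assemble_scalar(conn, u, scale):
    """Reference loop: one element at a time."""
    f = [0.0] * len(u)
    for nodes in conn:
        local = [u[n] for n in nodes]               # non-contiguous gather
        mean = sum(local) / len(local)
        out = [scale * (v - mean) for v in local]   # toy element-level kernel
        for n, v in zip(nodes, out):
            f[n] += v                               # scatter-add (assembly)
    return f


def assemble_batched(conn, u, scale, width=W):
    """Same computation, with up to `width` elements handled per batch."""
    f = [0.0] * len(u)
    npe = len(conn[0])  # nodes per element (assumed identical for all elements)
    for start in range(0, len(conn), width):
        batch = conn[start:start + width]
        lanes = len(batch)
        # Gather into a lane-contiguous layout: local[i][lane].
        local = [[u[batch[l][i]] for l in range(lanes)] for i in range(npe)]
        # Each operation across `lanes` stands in for one SIMD instruction.
        mean = [sum(local[i][l] for i in range(npe)) / npe for l in range(lanes)]
        out = [[scale * (local[i][l] - mean[l]) for l in range(lanes)]
               for i in range(npe)]
        # Elements in a batch may share global nodes, so the scatter-add is
        # done lane by lane here to avoid conflicting updates.
        for i in range(npe):
            for l in range(lanes):
                f[batch[l][i]] += out[i][l]
    return f


# Small unstructured connectivity: 5 elements, 4 nodes each, 9 global nodes.
conn = [[0, 1, 2, 3], [3, 2, 5, 4], [4, 5, 7, 6], [6, 7, 1, 0], [2, 8, 5, 1]]
u = [float(i * i) for i in range(9)]
f_ref = assemble_scalar(conn, u, 0.5)
f_vec = assemble_batched(conn, u, 0.5)
```

In this layout the arithmetic runs over contiguous lanes, while the gather and the scatter-add still follow the irregular indices in `conn`; that indirection is the part the abstract identifies as an obstacle to automatic vectorization.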

    Published In

    WPMVP'18: Proceedings of the 2018 4th Workshop on Programming Models for SIMD/Vector Processing
    February 2018
    68 pages
    ISBN: 9781450356466
    DOI: 10.1145/3178433
    © 2018 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. FEM
    2. SIMD
    3. mini-app
    4. vectorization

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    PPoPP '18

    Acceptance Rates

    WPMVP'18 Paper Acceptance Rate 8 of 12 submissions, 67%;
    Overall Acceptance Rate 20 of 30 submissions, 67%

    Cited By

    • (2024) Alternative Quadrant Representations with Morton Index and AVX2 Vectorization for AMR Algorithms within the p4est Software Library. 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 301-310. DOI: 10.1109/IPDPSW63119.2024.00071. Online publication date: 27-May-2024
    • (2024) An approach for dynamically adaptable SIMD vectorization of FEM kernels. Computer Physics Communications, 109319. DOI: 10.1016/j.cpc.2024.109319. Online publication date: Jul-2024
    • (2021) Influential parameters on 3-D synthetic ground motions in a sedimentary basin derived from global sensitivity analysis. Geophysical Journal International 227:3, 1795-1817. DOI: 10.1093/gji/ggab304. Online publication date: 3-Aug-2021
    • (2020) On the Usage of the Arm C Language Extensions for a High-Order Finite-Element Kernel. 2020 IEEE International Conference on Cluster Computing (CLUSTER), 576-579. DOI: 10.1109/CLUSTER49012.2020.00081. Online publication date: Sep-2020
    • (2018) Performance Analysis of SIMD Vectorization of High-Order Finite-Element Kernels. 2018 International Conference on High Performance Computing & Simulation (HPCS), 423-430. DOI: 10.1109/HPCS.2018.00074. Online publication date: Jul-2018
