Research article
DOI: 10.1145/3295500.3356190

SW_GROMACS: accelerate GROMACS on Sunway TaihuLight

Published: 17 November 2019

Abstract

GROMACS is one of the most popular Molecular Dynamics (MD) applications and is widely used to study chemical and biomolecular systems. Like other MD applications, it requires long run times for large-scale simulations. Therefore, many high-performance platforms have been employed to accelerate it, such as Knights Landing (KNL), the Cell processor, and Graphics Processing Units (GPUs). Sunway TaihuLight, the third-fastest supercomputer in the world, contains 40,960 SW26010 processors, each a typical many-core processor. To make full use of TaihuLight's computing power, we port GROMACS to the SW26010 with the following new strategies: (1) a new deferred update strategy; (2) a new update mark strategy; (3) full pipeline acceleration. Furthermore, we redesign GROMACS to enable all possible vectorization. Experiments show that our implementation achieves better performance than both Intel KNL and the Nvidia P100 GPU when an appropriate number of SW26010 processors is used for a fair comparison.

Published In

SC '19: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
November 2019
1921 pages
ISBN: 9781450362290
DOI: 10.1145/3295500

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Funding Sources

  • Center for High Performance Computing and System Simulation
  • Pilot National Laboratory for Marine Science and Technology (Qingdao)
  • National Key R&D Program of China
  • National Natural Science Foundation of China

Conference

SC '19
Acceptance Rates

Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 46
  • Downloads (last 6 weeks): 1
Reflects downloads up to 20 Feb 2025.

Cited By

  • (2024) Distributed Heterogeneous Spiking Neural Network Simulator Using Sunway Accelerators. Big Data Mining and Analytics 7:4 (1301-1320). DOI: 10.26599/BDMA.2024.9020007. Online publication date: Dec-2024
  • (2024) SunwayLB: Enabling Extreme-Scale Lattice Boltzmann Method Based Computing Fluid Dynamics Simulations on Advanced Heterogeneous Supercomputers. IEEE Transactions on Parallel and Distributed Systems 35:2 (324-337). DOI: 10.1109/TPDS.2023.3343706. Online publication date: Feb-2024
  • (2024) O2ath: an OpenMP offloading toolkit for the sunway heterogeneous manycore platform. CCF Transactions on High Performance Computing 6:3 (274-286). DOI: 10.1007/s42514-024-00191-1. Online publication date: 3-May-2024
  • (2024) swCUDA: Auto parallel code translation framework from CUDA to ATHREAD for new generation sunway supercomputer. CCF Transactions on High Performance Computing 6:4 (439-458). DOI: 10.1007/s42514-023-00159-7. Online publication date: 11-Jan-2024
  • (2023) Bio-ESMD: A Data Centric Implementation for Large-Scale Biological System Simulation on Sunway TaihuLight Supercomputer. IEEE Transactions on Parallel and Distributed Systems 34:3 (881-893). DOI: 10.1109/TPDS.2022.3220559. Online publication date: 1-Mar-2023
  • (2023) SWPFOPLD: A Profiling and Optimizing Loader for SW26010Pro Processors. 2023 8th International Conference on Computer and Communication Systems (ICCCS) (780-784). DOI: 10.1109/ICCCS57501.2023.10151092. Online publication date: 21-Apr-2023
  • (2022) Cryo-EM Structure and Activator Screening of Human Tryptophan Hydroxylase 2. Frontiers in Pharmacology 13. DOI: 10.3389/fphar.2022.907437. Online publication date: 15-Aug-2022
  • (2022) Optimization of Reactive Force Field Simulation: Refactor, Parallelization, and Vectorization for Interactions. IEEE Transactions on Parallel and Distributed Systems 33:2 (359-373). DOI: 10.1109/TPDS.2021.3091408. Online publication date: 1-Feb-2022
  • (2022) Enabling Large-Scale Simulation of CAM on the Sunway TaihuLight Supercomputer. IEEE Transactions on Computers 71:4 (824-837). DOI: 10.1109/TC.2021.3063422. Online publication date: 1-Apr-2022
  • (2022) Accelerating cryo-EM Reconstruction of RELION on the New Sunway Supercomputer. 2022 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom) (129-138). DOI: 10.1109/ISPA-BDCloud-SocialCom-SustainCom57177.2022.00024. Online publication date: Dec-2022
