- research-article, January 2024
Massively parallel simulations of multi-stage compressors on Sunway TaihuLight
The Journal of Supercomputing (JSCO), Volume 80, Issue 8, Pages 11089–11128
https://doi.org/10.1007/s11227-023-05862-4
Abstract: ASPAC is an in-house computational fluid dynamics (CFD) software for the simulation of flow in turbomachinery. In this paper, with a dual-level hybrid and heterogeneous programming method, we optimized the ASPAC software and ran it on the Sunway ...
- research-article, September 2023
ESA: An efficient sequence alignment algorithm for biological database search on Sunway TaihuLight
Abstract: In computational biology, biological database search has been playing a very important role. Since the COVID-19 outbreak, it has provided significant help in identifying common characteristics of viruses and developing vaccines and ...
Highlights: In this paper, we propose and implement an efficient sequence alignment algorithm, ESA, for biological database search on SW26010 heterogeneous processors. ...
- research-article, November 2022
SunwayURANS: 3D full-annulus URANS simulations of transonic axial compressors on Sunway TaihuLight
The Journal of Supercomputing (JSCO), Volume 78, Issue 17, Pages 19167–19187
https://doi.org/10.1007/s11227-022-04628-8
Abstract: Three-dimensional full-annulus Unsteady Reynolds Averaged Navier–Stokes (URANS) simulations play a crucial role in predicting the aerodynamic performance of the transonic axial compressor rotor. In this paper, we report our work, SunwayURANS, ...
- research-article, January 2023
Automatically Generating High-performance Matrix Multiplication Kernels on the Latest Sunway Processor
ICPP '22: Proceedings of the 51st International Conference on Parallel Processing, Article No.: 52, Pages 1–12
https://doi.org/10.1145/3545008.3545031
We present an approach to the automatic generation of efficient matrix multiplication code on the latest Sunway processor, which will be employed by the next-generation machine of Sunway TaihuLight, one of the fastest supercomputers on earth. The ...
- research-article, March 2022
A new software cache structure on Sunway TaihuLight
The Journal of Supercomputing (JSCO), Volume 78, Issue 4, Pages 4779–4798
https://doi.org/10.1007/s11227-021-04056-0
Abstract: The Sunway TaihuLight is the first supercomputer built entirely with domestic processors in China. On Sunway TaihuLight, the local data memory (LDM) of the slave core is limited, so data transmission with the main memory is frequent during ...
- research-article, November 2020
Implementation and performance of Barnes-hut n-body algorithm on extreme-scale heterogeneous many-core architectures
Masaki Iwasawa, Daisuke Namekata, Ryo Sakamoto, Takashi Nakamura, Yasuyuki Kimura, Keigo Nitadori, Long Wang, Miyuki Tsubouchi, Jun Makino, Zhao Liu, Haohuan Fu, Guangwen Yang
International Journal of High Performance Computing Applications (SAGE-HPCA), Volume 34, Issue 6, Pages 615–628
https://doi.org/10.1177/1094342020943652
In this paper, we report the implementation and measured performance of our extreme-scale whole planetary ring simulation code on Sunway TaihuLight and two PEZY-SC2 systems: Shoubu System B and Gyoukou. The numerical algorithm is the parallel Barnes-Hut ...
- Article, October 2020
Performance Modeling of Stencil Computation on SW26010 Processors
Algorithms and Architectures for Parallel Processing, Pages 386–400
https://doi.org/10.1007/978-3-030-60245-1_27
Abstract: Stencil computation is a basic part in a large variety of scientific computing programs, especially for those containing partial differential equations. Due to the limited memory bandwidth, it is a challenge to improve the parallel efficiency of ...
- research-article, May 2020
Optimizing partitioned CSR-based SpGEMM on the Sunway TaihuLight
Neural Computing and Applications (NCAA), Volume 32, Issue 10, Pages 5571–5582
https://doi.org/10.1007/s00521-019-04121-z
Abstract: General sparse matrix-sparse matrix (SpGEMM) multiplication is one of the basic kernels in a great many applications. Several works focus on various optimizations for SpGEMM. To fully exploit the powerful computing capability of the Sunway ...
- research-article, March 2020
Enabling Highly Efficient Batched Matrix Multiplications on SW26010 Many-core Processor
ACM Transactions on Architecture and Code Optimization (TACO), Volume 17, Issue 1, Article No.: 3, Pages 1–23
https://doi.org/10.1145/3378176
We present a systematic methodology for optimizing batched matrix multiplications on SW26010 many-core processor of the Sunway TaihuLight supercomputer. Five surrogate algorithms and a machine learning–based algorithm selector are proposed to fully ...
- research-article, March 2020
Accelerating and tuning small matrix multiplications on Sunway TaihuLight: A case study of spectral element CFD Code Nek5000
International Journal of High Performance Computing Applications (SAGE-HPCA), Volume 34, Issue 2, Pages 178–186
https://doi.org/10.1177/1094342019882246
The matrix–matrix products for matrices of small size have continued to play an important part in a range of scientific applications. The heterogeneous architecture, which is predicted to be a trend in the exascale supercomputing era, gives rise to the ...
- research-article, November 2019
OpenKMC: a KMC design for hundred-billion-atom simulation using millions of cores on Sunway Taihulight
Kun Li, Honghui Shang, Yunquan Zhang, Shigang Li, Baodong Wu, Dong Wang, Libo Zhang, Fang Li, Dexun Chen, Zhiqiang Wei
SC '19: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, Article No.: 68, Pages 1–16
https://doi.org/10.1145/3295500.3356165
With more attention attached to nuclear energy, the formation mechanism of the solute clusters precipitation within complex alloys becomes intriguing research in the embrittlement of nuclear reactor pressure vessel (RPV) steels. Such phenomenon can be ...
- research-article, July 2019
Simulating the Wenchuan earthquake with accurate surface topography on Sunway TaihuLight
Bingwei Chen, Haohuan Fu, Yanwen Wei, Conghui He, Wenqiang Zhang, Yuxuan Li, Wubin Wan, Wei Zhang, Lin Gan, Wei Zhang, Zhenguo Zhang, Guangwen Yang, Xiaofei Chen
SC '18: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, Article No.: 40, Pages 1–12
https://doi.org/10.1109/SC.2018.00043
This paper reports our efforts on performing a 50-m resolution earthquake simulation of the Wenchuan Earthquake (Ms 8.0, China) on Sunway TaihuLight. To accurately capture the surface topography, we adopt a curvilinear grid finite-difference method with ...
- research-article, November 2018
Simulating the Wenchuan earthquake with accurate surface topography on Sunway TaihuLight
Bingwei Chen, Haohuan Fu, Yanwen Wei, Conghui He, Wenqiang Zhang, Yuxuan Li, Wubin Wan, Wei Zhang, Lin Gan, Wei Zhang, Zhenguo Zhang, Guangwen Yang, Xiaofei Chen
SC '18: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, Article No.: 40, Pages 1–12
This paper reports our efforts on performing a 50-m resolution earthquake simulation of the Wenchuan Earthquake (Ms 8.0, China) on Sunway TaihuLight. To accurately capture the surface topography, we adopt a curvilinear grid finite-difference method with ...
- research-article, August 2018
A Fast Sparse Triangular Solver for Structured-grid Problems on Sunway Many-core Processor SW26010
ICPP '18: Proceedings of the 47th International Conference on Parallel Processing, Article No.: 53, Pages 1–11
https://doi.org/10.1145/3225058.3225071
The sparse triangular solver (SpTRSV) is one of the most essential kernels in many scientific and engineering applications. Efficiently parallelizing the SpTRSV on modern many-core architectures is considerably difficult due to inherent dependency of ...
- research-article, August 2018
Massively Scaling the Metal Microscopic Damage Simulation on Sunway TaihuLight Supercomputer
Shigang Li, Baodong Wu, Yunquan Zhang, Xianmeng Wang, Jianjiang Li, Changjun Hu, Jue Wang, Yangde Feng, Ningming Nie
ICPP '18: Proceedings of the 47th International Conference on Parallel Processing, Article No.: 47, Pages 1–11
https://doi.org/10.1145/3225058.3225064
The limitation of simulation scales leads to a gap between simulation results and physical phenomena. This paper reports our efforts on increasing the scalability of metal material microscopic damage simulation on the Sunway TaihuLight supercomputer. We ...
- research-article, June 2018
Extreme-Scale High-Order WENO Simulations of 3-D Detonation Wave with 10 Million Cores
ACM Transactions on Architecture and Code Optimization (TACO), Volume 15, Issue 2, Article No.: 26, Pages 1–21
https://doi.org/10.1145/3209208
High-order stencil computations, frequently found in many applications, pose severe challenges to emerging many-core platforms due to the complexities of hardware architectures as well as the sophisticated computing and data movement patterns. In this ...
- research-article, March 2018
Performance Optimization of the HPCG Benchmark on the Sunway TaihuLight Supercomputer
ACM Transactions on Architecture and Code Optimization (TACO), Volume 15, Issue 1, Article No.: 11, Pages 1–20
https://doi.org/10.1145/3182177
In this article, we present some key techniques for optimizing HPCG on Sunway TaihuLight and demonstrate how to achieve high performance in memory-bound applications by exploiting specific characteristics of the hardware architecture. In particular, we ...
- research-article, November 2017
18.9-Pflops nonlinear earthquake simulation on Sunway TaihuLight: enabling depiction of 18-Hz and 8-meter scenarios
Haohuan Fu, Conghui He, Bingwei Chen, Zekun Yin, Zhenguo Zhang, Wenqiang Zhang, Tingjian Zhang, Wei Xue, Weiguo Liu, Wanwang Yin, Guangwen Yang, Xiaofei Chen
SC '17: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, Article No.: 2, Pages 1–12
https://doi.org/10.1145/3126908.3126910
This paper reports our large-scale nonlinear earthquake simulation software on Sunway TaihuLight. Our innovations include: (1) a customized parallelization scheme that employs the 10 million cores efficiently at both the process and the thread levels; (...