DOI: 10.1145/3126908.3126909
research-article

Redesigning CAM-SE for peta-scale climate modeling performance and ultra-high resolution on Sunway TaihuLight

Published: 12 November 2017

Abstract

The Community Atmosphere Model (CAM) is ported, redesigned, and scaled to the full system of the Sunway TaihuLight, and provides peta-scale climate modeling performance. In the first stage, we refactored and optimized the complete code using OpenACC directives. A more aggressive and finer-grained redesign is then applied to CAM to achieve finer memory control and usage, more efficient vectorization, and overlapping of computation and communication. We further improve the CAM performance of a 260-core Sunway processor to the equivalent of 28 to 184 Intel CPU cores, and achieve a sustained double-precision performance of 3.3 PFlops for a 750 m global simulation when using 10,075,000 cores. CAM on Sunway achieves simulation speeds of 3.4 and 21.5 simulated years per day (SYPD) for global 25-km and 100-km resolutions, respectively, and enables us to perform, to our knowledge, the first simulation of the complete lifecycle of Hurricane Katrina, with close-to-observation results for both track and intensity.
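
The headline figures in the abstract can be put in more familiar units with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, it assumes SYPD means simulated years per 24-hour wall-clock day, and it assumes the sustained rate is spread evenly over all cores. It uses only numbers quoted in the abstract.

```python
# Back-of-the-envelope arithmetic on figures quoted in the abstract.
# Function names are illustrative and do not come from the paper.

CORES_PER_PROCESSOR = 260      # cores in one Sunway processor
TOTAL_CORES = 10_075_000       # cores used in the 750 m global run
SUSTAINED_FLOPS = 3.3e15       # 3.3 PFlops, double precision


def processors_used(total_cores: int, cores_per_processor: int) -> int:
    """Number of whole processors implied by the core count."""
    return total_cores // cores_per_processor


def flops_per_core(sustained_flops: float, total_cores: int) -> float:
    """Average sustained rate per core, assuming an even spread."""
    return sustained_flops / total_cores


def wallclock_hours_per_simulated_year(sypd: float) -> float:
    """Wall-clock hours needed for one simulated year at a given SYPD."""
    return 24.0 / sypd


print("processors:", processors_used(TOTAL_CORES, CORES_PER_PROCESSOR))
print("MFlops per core: %.1f" % (flops_per_core(SUSTAINED_FLOPS, TOTAL_CORES) / 1e6))
for label, sypd in (("25-km", 3.4), ("100-km", 21.5)):
    hours = wallclock_hours_per_simulated_year(sypd)
    print("%s: %.2f wall-clock hours per simulated year" % (label, hours))
```

Under these assumptions, the 25-km configuration needs roughly seven wall-clock hours per simulated year, and the 100-km configuration a little over one hour.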



      Published In

      SC '17: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
      November 2017
      801 pages
      ISBN:9781450351140
      DOI:10.1145/3126908
      • General Chair: Bernd Mohr
      • Program Chair: Padma Raghavan
      Sponsors

      In-Cooperation

      • IEEE CS

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • Research-article

      Funding Sources

      • National Natural Science Foundation of China
      • China Postdoctoral Science Foundation
      • Tsinghua University Initiative Scientific Research Program
      • National Key R&D Program of China

      Conference

      SC '17
      Acceptance Rates

      SC '17 Paper Acceptance Rate 61 of 327 submissions, 19%;
      Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

      Article Metrics

      • Downloads (last 12 months): 24
      • Downloads (last 6 weeks): 1
      Reflects downloads up to 30 Aug 2024

      Cited By

      • (2024) Mixed-precision computing in the GRIST dynamical core for weather and climate modelling. Geoscientific Model Development 17:16 (6301-6318). DOI: 10.5194/gmd-17-6301-2024. Online publication date: 27-Aug-2024
      • (2024) O2ath: an OpenMP offloading toolkit for the sunway heterogeneous manycore platform. CCF Transactions on High Performance Computing 6:3 (274-286). DOI: 10.1007/s42514-024-00191-1. Online publication date: 3-May-2024
      • (2024) Evaluating the potential of disaggregated memory systems for HPC applications. Concurrency and Computation: Practice and Experience 36:19. DOI: 10.1002/cpe.8147. Online publication date: 31-May-2024
      • (2023) A Set of New Optimization Methods Based on Sunway Many-core Processor. Proceedings of the 2023 7th International Conference on High Performance Compilation, Computing and Communications (92-100). DOI: 10.1145/3606043.3606056. Online publication date: 17-Jun-2023
      • (2023) The Optimization of Multi-physics Application Simulated by Lattice Boltzmann Method Based on Domestic Processors. Proceedings of the 2023 2nd International Conference on Networks, Communications and Information Technology (42-47). DOI: 10.1145/3605801.3605810. Online publication date: 16-Jun-2023
      • (2023) The Simple Cloud-Resolving E3SM Atmosphere Model Running on the Frontier Exascale System. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (1-11). DOI: 10.1145/3581784.3627044. Online publication date: 12-Nov-2023
      • (2023) Enabling Real World Scale Structural Superlubricity All-Atom Simulation on the Next-Generation Sunway Supercomputer. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (1-14). DOI: 10.1145/3581784.3613210. Online publication date: 12-Nov-2023
      • (2023) InDe: An Inline Data Deduplication Approach via Adaptive Detection of Valid Container Utilization. ACM Transactions on Storage 19:1 (1-27). DOI: 10.1145/3568426. Online publication date: 11-Jan-2023
      • (2023) End-to-end I/O Monitoring on Leading Supercomputers. ACM Transactions on Storage 19:1 (1-35). DOI: 10.1145/3568425. Online publication date: 11-Jan-2023
      • (2023) Oasis: Controlling Data Migration in Expansion of Object-based Storage Systems. ACM Transactions on Storage 19:1 (1-22). DOI: 10.1145/3568424. Online publication date: 19-Jan-2023
