DOI: 10.5555/3014904.3015016
research-article

Refactoring and optimizing the Community Atmosphere Model (CAM) on the Sunway TaihuLight supercomputer

Published: 13 November 2016

Abstract

This paper reports our efforts in refactoring and optimizing the Community Atmosphere Model (CAM) on the Sunway TaihuLight supercomputer, which uses a many-core processor that consists of management processing elements (MPEs) and clusters of computing processing elements (CPEs). To map the large code base of CAM to the millions of cores on the Sunway system, we take OpenACC-based refactoring as the major approach, and apply source-to-source translator tools to exploit the most suitable parallelism for the CPE cluster and to fit the intermediate variables into the limited on-chip fast buffer. For individual kernels, comparing the original ported version using only MPEs with the refactored version using both the MPE and CPE clusters, we achieve up to 22x speedup for the compute-intensive kernels. For the 25 km resolution CAM global model, we scale to 24,000 MPEs and 1,536,000 CPEs, and achieve a simulation speed of 2.81 model years per day.
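The scale figures quoted in the abstract can be checked with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the 365-day model year is an assumption that the abstract does not state.

```python
# Figures quoted in the abstract.
MPES = 24_000        # management processing elements used in the largest run
CPES = 1_536_000     # computing processing elements used in the largest run
SYPD = 2.81          # simulated model years per wall-clock day

SECONDS_PER_DAY = 86_400
# Assumption (not stated in the abstract): one model year has 365 days.
DAYS_PER_MODEL_YEAR = 365


def cpes_per_mpe(mpes: int, cpes: int) -> int:
    """Return how many CPEs accompany each MPE; fail if the split is uneven."""
    if cpes % mpes != 0:
        raise ValueError("CPE count is not a whole multiple of MPE count")
    return cpes // mpes


def total_cores(mpes: int, cpes: int) -> int:
    """Return the total number of processing elements (MPEs plus CPEs)."""
    return mpes + cpes


def wall_seconds_per_model_day(sypd: float,
                               days_per_year: int = DAYS_PER_MODEL_YEAR) -> float:
    """Return wall-clock seconds needed to simulate one model day."""
    return SECONDS_PER_DAY / (sypd * days_per_year)


if __name__ == "__main__":
    print("CPEs per MPE:", cpes_per_mpe(MPES, CPES))
    print("Total cores:", total_cores(MPES, CPES))
    print("Wall seconds per model day:",
          round(wall_seconds_per_model_day(SYPD), 1))
```

Under these assumptions, each MPE is paired with 64 CPEs, and one simulated day costs a little under a minute and a half of wall-clock time.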


Cited By

  • (2019) Optimizing the HOMME dynamical core for multicore platforms. International Journal of High Performance Computing Applications 33:5 (1030-1045). DOI: 10.1177/1094342019849618. Online publication date: 1-Sep-2019
  • (2018) Communication-Avoiding for Dynamical Core of Atmospheric General Circulation Model. Proceedings of the 47th International Conference on Parallel Processing (1-10). DOI: 10.1145/3225058.3225140. Online publication date: 13-Aug-2018

Published In

SC '16: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
November 2016
1034 pages
ISBN:9781467388153
Conference Chair: John West

Publisher

IEEE Press

Author Tags

  1. atmospheric modeling
  2. many-core
  3. OpenACC
  4. optimization
  5. tool

Qualifiers

  • Research-article

Conference

SC16
Acceptance Rates

SC '16 Paper Acceptance Rate 81 of 442 submissions, 18%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%
