Merge pull request #1086 from Katetc/katetc/new_clubb_061424
cam6_4_019: New CLUBB External to fix GPU problem
Small answer changes, but no new CLUBB science. Diagnostics can be found here:
https://webext.cgd.ucar.edu/F2000climo/newCLUBBtesting/larson_tag_20240605.katemerge.062724-0201.F2000dev.f09_f09_mg17_1_2_vs_larson_tag_control.cam6_3_162.062724-1359.F2000dev.f09_f09_mg17_1_2/website/index.html

Ran 6-year ne30 BLT1850 simulations to look at differences in longer, more applicable runs. The diagnostics were troublesome to produce, but the results can be found here:
https://webext.cgd.ucar.edu/FHIST/clubb_tests/larson_tag_20240605.katemerge.071924-1633.FLTHIST.ne30pg3_g17_1979_1984_vs_larson_tag_control.cam6_4_007.071924-1639.FLTHIST.ne30pg3_g17_1979_1984/website/

Differences seem to be minimal.

Fixes #1036
closes #1048
Katetc committed Aug 12, 2024
2 parents 0921f39 + ed684d2 commit eb27509
Showing 21 changed files with 481 additions and 402 deletions.
4 changes: 2 additions & 2 deletions .gitmodules
@@ -96,7 +96,7 @@
url = https://github.com/larson-group/clubb_release
fxrequired = AlwaysRequired
fxsparse = ../.clubb_sparse_checkout
-fxtag = clubb_4ncar_20231115_5406350
+fxtag = clubb_4ncar_20240605_73d60f6_gpufixes_posinf
fxDONOTUSEurl = https://github.com/larson-group/clubb_release

[submodule "cism"]
@@ -151,7 +151,7 @@ fxDONOTUSEurl = https://github.com/ESCOMP/CMEPS.git
[submodule "cdeps"]
path = components/cdeps
url = https://github.com/ESCOMP/CDEPS.git
-fxtag = cdeps1.0.43
+fxtag = cdeps1.0.45
fxrequired = ToplevelRequired
fxDONOTUSEurl = https://github.com/ESCOMP/CDEPS.git
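
To check which external tags a checkout points at after a change like this one, the `fxtag` entries can be read straight from `.gitmodules`. The following is a minimal sketch, not the tool CAM itself uses to manage externals. The helper `read_fxtags` is hypothetical, and the submodule name `clubb` is an assumption, since the hunk above does not show that section header.

```python
# Sample text modeled on the two entries changed in this commit.
# The section name "clubb" is an assumption; the diff hunk does not show it.
GITMODULES_TEXT = """\
[submodule "clubb"]
    url = https://github.com/larson-group/clubb_release
    fxrequired = AlwaysRequired
    fxtag = clubb_4ncar_20240605_73d60f6_gpufixes_posinf
[submodule "cdeps"]
    path = components/cdeps
    url = https://github.com/ESCOMP/CDEPS.git
    fxtag = cdeps1.0.45
    fxrequired = ToplevelRequired
"""


def read_fxtags(text):
    """Return {submodule_name: fxtag} from .gitmodules-style text."""
    tags = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith('[submodule "') and line.endswith('"]'):
            # Keep only the quoted submodule name.
            current = line[len('[submodule "'):-2]
        elif current and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key == "fxtag":
                tags[current] = value
    return tags


if __name__ == "__main__":
    for name, tag in read_fxtags(GITMODULES_TEXT).items():
        print(f"{name}: {tag}")
```

In a real checkout the same function could be fed the contents of the repository's `.gitmodules` file instead of the sample string.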

4 changes: 4 additions & 0 deletions bld/namelist_files/namelist_defaults_cam.xml
@@ -2241,6 +2241,7 @@
<do_hb_above_clubb >.false. </do_hb_above_clubb>
<do_hb_above_clubb phys="cam6" >.true. </do_hb_above_clubb>
<do_hb_above_clubb phys="cam7" >.true. </do_hb_above_clubb>

<!-- SILHS options -->
<clubb_do_icesuper silhs="1" > .true. </clubb_do_icesuper>
<clubb_C2rt silhs="1" > 0.2 </clubb_C2rt>
@@ -2266,6 +2267,7 @@
<clubb_skw_max_mag silhs="1" > 10.0 </clubb_skw_max_mag>
<clubb_up2_sfc_coef silhs="1" > 4.0 </clubb_up2_sfc_coef>
<clubb_C_wp2_splat silhs="1" > 0.0 </clubb_C_wp2_splat>
<clubb_bv_efold silhs="1" > 5.0 </clubb_bv_efold>

<clubb_l_brunt_vaisala_freq_moist silhs="1" > .true. </clubb_l_brunt_vaisala_freq_moist>
<clubb_l_call_pdf_closure_twice silhs="1" > .false. </clubb_l_call_pdf_closure_twice>
@@ -2284,6 +2286,8 @@
<clubb_l_vert_avg_closure silhs="1" > .false. </clubb_l_vert_avg_closure>
<clubb_l_diag_Lscale_from_tau silhs="1" > .false. </clubb_l_diag_Lscale_from_tau>
<clubb_l_damp_wp2_using_em silhs="1" > .false. </clubb_l_damp_wp2_using_em>
<clubb_wpxp_Ri_exp silhs="1" > 0.5 </clubb_wpxp_Ri_exp>
<clubb_z_displace silhs="1" > 25.00 </clubb_z_displace>


<!-- CLUBB+MF options -->
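
The entries above attach defaults to attribute conditions such as `silhs="1"` or `phys="cam6"`. The sketch below illustrates one plausible way such attribute-qualified defaults can be resolved: an entry applies when all of its attributes match the configuration, and the entry with the most attributes wins. This rule is an assumption made for illustration and is a simplification of whatever CAM's namelist builder actually does. The wrapper element `namelist_defaults` and the helper `defaults_for` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Entries copied from the hunks above, wrapped in a hypothetical root element.
FRAGMENT = """\
<namelist_defaults>
<do_hb_above_clubb >.false. </do_hb_above_clubb>
<do_hb_above_clubb phys="cam6" >.true. </do_hb_above_clubb>
<clubb_bv_efold silhs="1" > 5.0 </clubb_bv_efold>
<clubb_wpxp_Ri_exp silhs="1" > 0.5 </clubb_wpxp_Ri_exp>
<clubb_z_displace silhs="1" > 25.00 </clubb_z_displace>
</namelist_defaults>
"""


def defaults_for(xml_text, **config):
    """Pick one default per variable for the given configuration.

    An entry applies only if every attribute on it equals the matching
    config value; among applicable entries, the one with the most
    attributes is preferred (assumed rule, for illustration only).
    """
    best = {}
    for elem in ET.fromstring(xml_text):
        if all(config.get(key) == val for key, val in elem.attrib.items()):
            score = len(elem.attrib)
            if elem.tag not in best or score > best[elem.tag][0]:
                best[elem.tag] = (score, (elem.text or "").strip())
    return {name: value for name, (_, value) in best.items()}


if __name__ == "__main__":
    print(defaults_for(FRAGMENT, silhs="1"))
    print(defaults_for(FRAGMENT, phys="cam6"))
```

With `silhs="1"` the three new CLUBB parameters resolve to the values added in this commit; without it they have no default in this fragment.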
63 changes: 18 additions & 45 deletions cime_config/testdefs/testlist_cam.xml
@@ -1480,6 +1480,24 @@
<!-- (unsupported) -->
<!-- @@@@@@@@@@@@@@@@@@@@@@@@@@@ -->

<test compset="F2000dev" grid="ne30pg3_ne30pg3_mg17" name="ERS_Ln9_G4-a100-openacc" testmods="cam/outfrq9s_mg3_default">
<machines>
<machine name="derecho" compiler="nvhpc" category="derecho_gpu"/>
<machine name="derecho" compiler="nvhpc" category="aux_cam"/>
</machines>
<options>
<option name="wallclock">00:30:00</option>
</options>
</test>
<test compset="F2000dev" grid="ne30pg3_ne30pg3_mg17" name="ERS_Ln9_G4-a100-openacc" testmods="cam/outfrq9s_mg3_pcols760">
<machines>
<machine name="derecho" compiler="nvhpc" category="derecho_gpu"/>
<machine name="derecho" compiler="nvhpc" category="prealpha"/>
</machines>
<options>
<option name="wallclock">00:30:00</option>
</options>
</test>
<test compset="F2000dev" grid="ne30pg3_ne30pg3_mg17" name="ERP_D_Ln9" testmods="cam/outfrq9s" supported="false">
<machines>
<machine name="derecho" compiler="intel" category="aux_pumas"/>
@@ -1542,51 +1560,6 @@
<machine name="derecho" compiler="intel" category="camchem"/>
</machines>
</test>
<test compset="F2000dev" grid="f09_f09_mg17" name="ERP_Ln9" testmods="cam/outfrq9s_mg3_default">
<machines>
<machine name="casper" compiler="pgi-gpu" category="casper_gpu"/>
<machine name="casper" compiler="nvhpc-gpu" category="casper_gpu"/>
</machines>
<options>
<option name="wallclock">00:59:00</option>
</options>
</test>
<test compset="F2000dev" grid="f09_f09_mg17" name="ERP_Ln9" testmods="cam/outfrq9s_mg2_default">
<machines>
<machine name="casper" compiler="pgi-gpu" category="casper_gpu"/>
<machine name="casper" compiler="nvhpc-gpu" category="casper_gpu"/>
</machines>
<options>
<option name="wallclock">00:59:00</option>
</options>
</test>
<test compset="F2000dev" grid="f09_f09_mg17" name="ERP_Ln9" testmods="cam/outfrq9s_mg3_nondefault">
<machines>
<machine name="casper" compiler="pgi-gpu" category="casper_gpu"/>
<machine name="casper" compiler="nvhpc-gpu" category="casper_gpu"/>
</machines>
<options>
<option name="wallclock">00:59:00</option>
</options>
</test>
<test compset="F2000dev" grid="f09_f09_mg17" name="ERP_Ln9" testmods="cam/outfrq9s_mg3_pcols1536">
<machines>
<machine name="casper" compiler="pgi-gpu" category="casper_gpu"/>
<machine name="casper" compiler="nvhpc-gpu" category="casper_gpu"/>
</machines>
<options>
<option name="wallclock">00:59:00</option>
</options>
</test>
<test compset="F2000dev" grid="f09_f09_mg17" name="ERP_Ln9_G4" testmods="cam/outfrq9s_mg3_default">
<machines>
<machine name="casper" compiler="pgi-gpu" category="casper_gpu"/>
<machine name="casper" compiler="nvhpc-gpu" category="casper_gpu"/>
</machines>
<options>
<option name="wallclock">00:59:00</option>
</options>
</test>
<test compset="QPC4" grid="f19_f19_mg17" name="ERP_Ln9" testmods="cam/outfrq9s">
<machines>
<machine name="derecho" compiler="intel" category="test_release"/>
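
The test names reported in the ChangeLog (for example `ERS_Ln9_G4-a100-openacc.ne30pg3_ne30pg3_mg17.F2000dev.derecho_nvhpc.cam-outfrq9s_mg3_default`) appear to be assembled from the attributes of a `<test>` entry. The sketch below reconstructs that name from the first new entry in this file. The joining rule is inferred from the names shown in this commit rather than taken from CIME's documentation, and the helper `full_test_names` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The first GPU test added to testlist_cam.xml in this commit.
TEST_XML = """\
<test compset="F2000dev" grid="ne30pg3_ne30pg3_mg17" name="ERS_Ln9_G4-a100-openacc" testmods="cam/outfrq9s_mg3_default">
  <machines>
    <machine name="derecho" compiler="nvhpc" category="derecho_gpu"/>
    <machine name="derecho" compiler="nvhpc" category="aux_cam"/>
  </machines>
  <options>
    <option name="wallclock">00:30:00</option>
  </options>
</test>
"""


def full_test_names(test_xml, category):
    """Build dotted test names for the machines listed under one category."""
    test = ET.fromstring(test_xml)
    # Inferred rule: "cam/outfrq9s_mg3_default" becomes "cam-outfrq9s_mg3_default".
    mods = test.get("testmods", "").replace("/", "-")
    names = []
    for machine in test.iter("machine"):
        if machine.get("category") != category:
            continue
        parts = [
            test.get("name"),
            test.get("grid"),
            test.get("compset"),
            f'{machine.get("name")}_{machine.get("compiler")}',
        ]
        if mods:
            parts.append(mods)
        names.append(".".join(parts))
    return names


if __name__ == "__main__":
    for name in full_test_names(TEST_XML, "aux_cam"):
        print(name)
```

Filtering by category mirrors how the entry places the same test in both the `derecho_gpu` and `aux_cam` suites.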

This file was deleted.

@@ -1,4 +1,4 @@
-./xmlchange NTASKS=36
+./xmlchange NTASKS=128
./xmlchange NTHRDS=1
./xmlchange ROOTPE='0'
./xmlchange ROF_NCPL=`./xmlquery --value ATM_NCPL`

This file was deleted.

This file was deleted.

This file was deleted.

This file was deleted.

This file was deleted.

@@ -1,8 +1,8 @@
-./xmlchange NTASKS=36
+./xmlchange NTASKS=64
./xmlchange NTHRDS=1
./xmlchange ROOTPE='0'
./xmlchange ROF_NCPL=`./xmlquery --value ATM_NCPL`
./xmlchange GLC_NCPL=`./xmlquery --value ATM_NCPL`
-./xmlchange CAM_CONFIG_OPTS=' -microphys mg3 -pcols 1536' --append
+./xmlchange CAM_CONFIG_OPTS=' -microphys mg3 -pcols 760 ' --append
./xmlchange TIMER_DETAIL='6'
./xmlchange TIMER_LEVEL='999'
150 changes: 150 additions & 0 deletions doc/ChangeLog
@@ -1,5 +1,155 @@
===============================================================

Tag name: cam6_4_019
Originator(s): katec, cacraig, vlarson, bstephens82, huebleruwm, zarzycki, JulioTBacmeister, jedwards4b
Date: 12 August 2024
One-line Summary: New CLUBB external, new GPU/nvhpc test suite, new CDEPS external
Github PR URL: https://github.com/ESCOMP/CAM/pull/1086

Purpose of changes (include the issue number and title text for each relevant GitHub issue):
- New CLUBB external with fixes to support GPU testing #1036
- part of cam6_4_019: Add GPU regression test suite #1048

Describe any changes made to build system: none

Describe any changes made to the namelist:
- Add default values for a few new CLUBB namelist parameters: clubb_bv_efold, clubb_wpxp_Ri_exp, and clubb_z_displace

List any changes to the defaults for the boundary datasets: none

Describe any substantial timing or memory changes: none

Code reviewed by: cacraigucar, sjsprecious, adamrher, bstephens82

List all files eliminated:
cime/config/testmods_dirs/cam/outfrq9s_mg3_nondefault/shell_commands
cime/config/testmods_dirs/cam/outfrq9s_mg3_nondefault/user_nl_cam
cime/config/testmods_dirs/cam/outfrq9s_mg3_nondefault/user_nl_clm
- Removed as part of GPU test updates

List all files added and what they do: None

List all existing files that have been modified, and describe the changes:
.gitmodules
- Point to new CLUBB external (clubb_4ncar_20240605_73d60f6_gpufixes_posinf)
and new CDEPS external (cdeps1.0.45)

cime/config/testdefs/testlist_cam.xml
- Add nvhpc gpu test on Derecho, remove Casper tests

cime/config/testdefs/testmods_dirs/cam/outfrq9s_mg2_default/shell_commands
cime/config/testdefs/testmods_dirs/cam/outfrq9s_mg3_default/shell_commands
- Change NTASKS for Derecho gpus

cime/config/testdefs/testmods_dirs/cam/outfrq9s_mg3_pcols1536/
- Directory renamed to cime/config/testdefs/testmods_dirs/cam/outfrq9s_mg3_pcols760
- Files updated to reflect the change

doc/ChangeLog_template
- Added space for new derecho/nvhpc required tests

src/physics/cam/clubb_intr.F90
src/physics/cam/subcol_SILHS.F90
- Updates to support the new external

test/system/archive_baseline.sh
test/system/test_driver.sh
- Updates to require CAM_FC compiler specification on Derecho (either intel or nvhpc)

If there were any failures reported from running test_driver.sh on any test
platform, and checkin with these failures has been OK'd by the gatekeeper,
then copy the lines from the td.*.status files for the failed tests to the
appropriate machine below. All failed tests must be justified.

derecho/intel/aux_cam:
ERP_Ln9.f09_f09_mg17.FCSD_HCO.derecho_intel.cam-outfrq9s (Overall: FAIL) details:
- pre-existing failure due to HEMCO not having reproducible results (issues #1018 and #856)

SMS_D_Ln9_P1280x1.ne0ARCTICne30x4_ne0ARCTICne30x4_mt12.FHIST.derecho_intel.cam-outfrq9s (Overall: FAIL) details:
SMS_D_Ln9_P1280x1.ne0CONUSne30x8_ne0CONUSne30x8_mt12.FCHIST.derecho_intel.cam-outfrq9s (Overall: PEND) details:
- pre-existing failures -- need fix in CLM external

SMS_D_Ln9.T42_T42.FSCAM.derecho_intel.cam-outfrq9s (Overall: FAIL) details:
- pre-existing failure -- need fix in CICE external

ERC_D_Ln9.f19_f19_mg17.QPC6.derecho_intel.cam-outfrq3s_cosp (Overall: DIFF) details:
ERC_D_Ln9_P144x1.ne16pg3_ne16pg3_mg17.QPC6HIST.derecho_intel.cam-outfrq3s_ttrac_usecase (Overall: DIFF) details:
ERP_D_Ln9.f19_f19_mg17.QPC6.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
ERP_D_Ln9.ne30pg3_ne30pg3_mg17.FLTHIST.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
ERP_D_Ln9.ne30pg3_ne30pg3_mg17.FLTHIST.derecho_intel.cam-outfrq9s_rrtmgp (Overall: DIFF) details:
ERP_D_Ln9_P64x2.f09_f09_mg17.QSC6.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
ERP_Ld3.f09_f09_mg17.FWHIST.derecho_intel.cam-reduced_hist1d (Overall: DIFF) details:
ERP_Ln9.C96_C96_mg17.F2000climo.derecho_intel.cam-outfrq9s_mg3 (Overall: DIFF) details:
ERP_Ln9.f09_f09_mg17.F1850.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
ERP_Ln9.f09_f09_mg17.F2000climo.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
ERP_Ln9.f09_f09_mg17.F2010climo.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
ERP_Ln9.f09_f09_mg17.FHIST_BDRD.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
ERP_Ln9.f19_f19_mg17.FWsc1850.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
ERP_Ln9.ne30pg3_ne30pg3_mg17.FCnudged.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
ERP_Ln9.ne30pg3_ne30pg3_mg17.FW2000climo.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
ERP_Ln9_P24x3.f45_f45_mg37.QPWmaC6.derecho_intel.cam-outfrq9s_mee_fluxes (Overall: DIFF) details:
ERS_Ld3.f10_f10_mg37.F1850.derecho_intel.cam-outfrq1d_14dec_ghg_cam7 (Overall: DIFF) details:
ERS_Ln9.f09_f09_mg17.FX2000.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
ERS_Ln9.f19_f19_mg17.FXSD.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
ERS_Ln9_P288x1.mpasa120_mpasa120.F2000climo.derecho_intel.cam-outfrq9s_mpasa120 (Overall: DIFF) details:
ERS_Ln9_P36x1.mpasa480_mpasa480.F2000climo.derecho_intel.cam-outfrq9s_mpasa480 (Overall: DIFF) details:
SMS_D_Ln9.f09_f09_mg17.FCts2nudged.derecho_intel.cam-outfrq9s_leapday (Overall: DIFF) details:
SMS_D_Ln9.f09_f09_mg17.FCvbsxHIST.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
SMS_D_Ln9.f09_f09_mg17.FSD.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
SMS_D_Ln9.f19_f19_mg17.FWma2000climo.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
SMS_D_Ln9.f19_f19_mg17.FWma2000climo.derecho_intel.cam-outfrq9s_waccm_ma_mam4 (Overall: DIFF) details:
SMS_D_Ln9.f19_f19_mg17.FXHIST.derecho_intel.cam-outfrq9s_amie (Overall: DIFF) details:
SMS_D_Ln9.f19_f19_mg17.QPC2000climo.derecho_intel.cam-outfrq3s_usecase (Overall: DIFF) details:
SMS_D_Ln9.ne16pg3_ne16pg3_mg17.FX2000.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
SMS_D_Ln9.ne30pg3_ne30pg3_mg17.FCts4MTHIST.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
SMS_D_Ln9.ne30pg3_ne30pg3_mg17.FMTHIST.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
SMS_D_Ln9_P1280x1.ne30pg3_ne30pg3_mg17.FCLTHIST.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
SMS_Ld1.f09_f09_mg17.FCHIST_GC.derecho_intel.cam-outfrq1d (Overall: DIFF) details:
SMS_Ld1.f09_f09_mg17.FW2000climo.derecho_intel.cam-outfrq1d (Overall: DIFF) details:
SMS_Ld1.ne30pg3_ne30pg3_mg17.FC2010climo.derecho_intel.cam-outfrq1d (Overall: DIFF) details:
SMS_Lh12.f09_f09_mg17.FCSD_HCO.derecho_intel.cam-outfrq3h (Overall: DIFF) details:
SMS_Lm13.f10_f10_mg37.F2000climo.derecho_intel.cam-outfrq1m (Overall: DIFF) details:
SMS_Ln9.f09_f09_mg17.F2010climo.derecho_intel.cam-nudging (Overall: DIFF) details:
SMS_Ln9.f09_f09_mg17.FW1850.derecho_intel.cam-reduced_hist3s (Overall: DIFF) details:
SMS_Ln9.f19_f19.F2000climo.derecho_intel.cam-silhs (Overall: DIFF) details:
SMS_Ln9.f19_f19_mg17.FHIST.derecho_intel.cam-outfrq9s_nochem (Overall: DIFF) details:
SMS_Ln9.ne30pg3_ne30pg3_mg17.FW2000climo.derecho_intel.cam-outfrq9s_rrtmgp (Overall: DIFF) details:
- Expected differences due to the new CLUBB external (See PR for discussion)

derecho/nvhpc/aux_cam:

ERS_Ln9_G4-a100-openacc.ne30pg3_ne30pg3_mg17.F2000dev.derecho_nvhpc.cam-outfrq9s_mg3_default (Overall: DIFF)
FAIL ERS_Ln9_G4-a100-openacc.ne30pg3_ne30pg3_mg17.F2000dev.derecho_nvhpc.cam-outfrq9s_mg3_default BASELINE /glade/campaign/cesm/community/amwg/cam_baselines/cam6_4_018_intel: ERROR BFAIL baseline directory '/glade/campaign/cesm/community/amwg/cam_baselines/cam6_4_018_intel/ERS_Ln9_G4-a100-openacc.ne30pg3_ne30pg3_mg17.F2000dev.derecho_nvhpc.cam-outfrq9s_mg3_default' does not exist
- Expected baseline comparison failure: no baselines are stored for GPU tests that did not previously exist

izumi/nag/aux_cam:
DAE.f45_f45_mg37.FHS94.izumi_nag.cam-dae (Overall: FAIL) details:
- pre-existing failure - issue #670

ERC_D_Ln9.f10_f10_mg37.QPC6.izumi_nag.cam-outfrq3s_am (Overall: DIFF) details:
ERC_D_Ln9.f10_f10_mg37.QPC6.izumi_nag.cam-outfrq3s_cospsathist (Overall: DIFF) details:
ERC_D_Ln9.f10_f10_mg37.QPC6.izumi_nag.cam-outfrq3s (Overall: DIFF) details:
ERC_D_Ln9.f10_f10_mg37.QPWmaC6.izumi_nag.cam-outfrq3s (Overall: DIFF) details:
ERI_D_Ln18.f19_f19_mg17.QPC6.izumi_nag.cam-ghgrmp_e8 (Overall: DIFF) details:
ERP_Ln9.ne5pg3_ne5pg3_mg37.QPC6.izumi_nag.cam-outfrq9s_clubbmf (Overall: DIFF) details:
SMS_D_Ln9.f10_f10_mg37.QPC6.izumi_nag.cam-outfrq3s_ba (Overall: DIFF) details:
SMS_P48x1_D_Ln3.f09_f09_mg17.QPC6HIST.izumi_nag.cam-outfrq3s_co2cycle_usecase (Overall: DIFF) details:
- Expected differences due to the new CLUBB external (See PR for discussion)

izumi/gnu/aux_cam:
ERP_D_Ln9.C48_C48_mg17.QPC6.izumi_gnu.cam-outfrq9s (Overall: DIFF) details:
ERP_D_Ln9.ne3pg3_ne3pg3_mg37.QPC6.izumi_gnu.cam-outfrq9s_rrtmgp (Overall: DIFF) details:
- Expected differences due to the new CLUBB external (See PR for discussion)

CAM tag used for the baseline comparison tests if different than previous
tag: cam6_4_018

Summarize any changes to answers:
All compsets that use CLUBB (cam6+) will have slight answer changes. Discussion in PR.
Nvhpc gpu tests have no stored baseline for comparison.
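
For an entry with this many reported tests, a quick tally by overall status can help when reviewing. The sketch below counts statuses from lines in the `(Overall: STATUS)` form used above. The sample lines are copied from this entry, while the helper `tally` and the regular expression are illustrative assumptions, not part of CAM's test tooling.

```python
import re
from collections import Counter

# Matches lines such as "<test name> (Overall: DIFF) details:".
STATUS_RE = re.compile(r"^(?P<test>\S+) \(Overall: (?P<status>[A-Z]+)\)")

# A few lines copied from the derecho/intel/aux_cam section of this entry.
SAMPLE = """\
ERP_Ln9.f09_f09_mg17.FCSD_HCO.derecho_intel.cam-outfrq9s (Overall: FAIL) details:
SMS_D_Ln9_P1280x1.ne0CONUSne30x8_ne0CONUSne30x8_mt12.FCHIST.derecho_intel.cam-outfrq9s (Overall: PEND) details:
ERP_Ln9.f09_f09_mg17.F1850.derecho_intel.cam-outfrq9s (Overall: DIFF) details:
SMS_Ln9.f19_f19.F2000climo.derecho_intel.cam-silhs (Overall: DIFF) details:
- Expected differences due to the new CLUBB external (See PR for discussion)
"""


def tally(text):
    """Count test lines by their overall status; other lines are ignored."""
    counts = Counter()
    for line in text.splitlines():
        match = STATUS_RE.match(line.strip())
        if match:
            counts[match.group("status")] += 1
    return counts


if __name__ == "__main__":
    for status, count in sorted(tally(SAMPLE).items()):
        print(f"{status}: {count}")
```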

===============================================================

Tag name: cam6_4_018
Originator(s): peverwhee, jedwards4b
Date: 30 July 2024