generate ADF config file #142

brianpm · 2024-10-14T21:03:54Z

New Feature Submissions:

Does your submission pass tests?
Have you lint your code locally prior to submission?

Not sure about tests... but yes, linted code locally.

Changes to Core Features:

Have you added an explanation of what your changes do and why you'd like us to include them?
Have you successfully tested your changes locally?

This adds a script that is a first attempt to take the CUPID config file to generate an ADF config file. The script works locally.

The idea is to be able to run the ADF without the user needing to manually edit the ADF config file.

I think we need to discuss what are the requirements for such an input file. It seems like ADF is expecting more information than CUPID right now... which is fine, but might mean that this script needs to make quite a few assumptions along the way.

brianpm · 2024-10-23T21:18:55Z

I just pushed some updates (mostly untested). I think this gets us pretty close to a version that will run ADF.

This script would be run as (for example):
python generate_adf_config_file.py --cupid_file ~/Code/CUPiD/examples/key_metrics/config.yml --adf_template ~/Code/CAM_diagnostics/config_amwg_default_plots.yaml --out_file ~/Desktop/testfile2.yaml

Then, in principle, ADF would be run like:

> cd $ADFSOURCE
> ./run_adf_diag ~/Desktop/testfile2.yaml

Note that I am pretty sure the path for ADF output won't work yet. I set it to CUPiD's "sname", but I think it needs to be expanded to be a full path.

TeaganKing · 2024-11-04T21:03:55Z

These changes were made:

a few minor formatting tweaks
adjusted the path to ADF output based on Create a new example that will run full ADF package (not from notebook) #141
updated the file paths to include the base or test case name after the CESM output dir
specified a full path based on the cupid_file path for outputting plots (otherwise this directory may not be found from the ADF root)
updated the directory structure in the config file to casename/atm/proc/tseries instead of casename/proc/tseries (we could alternatively update how we are doing this in CUPiD)

A few notes:

I was able to create an ADF config file with a similar command as the one you shared above (just using my own files) and started running ADF in the cupid-analysis environment.
While this did start running and generated timeseries and climo files after hist_str was manually updated, it appears to be hanging before plots finish generating and ADF runs to completion.

A few questions:

ADF does not seem to find all of the files needed for input because hist_str is set to cam.h0 and the adf lib expects to find data with the following format (note the period after h0!), but only finds cam.h0i files. We should probably update this, but I'm not sure if there's a consistent format we'd expect?
/glade/campaign/cesm/development/cross-wg/diagnostic_framework/CESM_output_for_testing/b.e23_alpha17f.BLT1850.ne30_t232.092/atm/hist/*cam.h0.*
ADF expects a top level 'basic diag info' hist_str (which matches the current cupid config setup), but there are case-specific hist_str included in the adf config file (it seems like theses may not be used). Can we remove these?
Do we also want to include just a subset of variables to get started, or is it best to run timeseries, etc, for all of them?

TeaganKing · 2024-11-04T22:59:27Z

@justin-richling or @brianpm if either of you have suggestions regarding what our default hist_str should be and whether this should be global or case-specific (see comment above), that would be helpful! FYI I'll continue working on this on Wednesday afternoon and could implement any relevant suggestions at that time.

…and add atm in proc/tseries paths

TeaganKing · 2024-11-07T17:32:03Z

Some updates on plot generation: the hanging after generating ts and climo files is related to processing the 3D variables. There also is a KeyError when looking for lat while generating zonal mean plots.

TeaganKing · 2024-11-08T23:05:22Z

The default config file for ADF has been updated (thanks to Justin!) and now the hist_str is just set to h0a to use averaged fields.

When running just PRECT and using an updated case for comparison (b.e30_beta02.BLT1850.ne30_t232.104 vs f.e23_alpha17f.FLTHIST_ne30.roughtopo.099 -- this is updated in the config file in #141 ), this runs fine and doesn't hang. It looks like that was a grid-related issue.

TeaganKing · 2024-11-12T18:29:07Z

A variable list could be added to the cupid config file that overwrites the default ADF variables.

TeaganKing · 2024-11-13T18:37:11Z

The external_diag_packages config.yml in #141 has been updated to include same-grid cases and a list of adf_vars which now can be used by this helper script (but is not necessary so that the script is compatible with other yaml files).

mnlevy1981

This looks great! I'm still putting together a test case to try it out myself, but it sounds like @TeaganKing has already done that so we are confident the script works as advertised :)

Two last consistency points:

generate_cupid_config_for_cesm_case.py is executable, and my inclination is that generate_adf_config_file.py should be as well (but I'm open to the idea of requiring python generate_cupid_config_for_cesm_case.py instead of allowing ./generate_cupid_config_for_cesm_case.py if there is a good reason to avoid the latter)
The scripts in cupid/ all use click instead of argparse and the scripts in this directory probably should as well. That's best handled in a separate PR, I just wanted to make a note here while I'm thinking about it.

helper_scripts/generate_adf_config_file.py

mnlevy1981 · 2024-11-15T18:21:01Z

helper_scripts/generate_adf_config_file.py

+
+    # Set case names for ADF config
+    a_dict["diag_cam_climo"]["cam_case_name"] = test_case_name
+    a_dict["diag_cam_baseline_climo"]["cam_case_name"] = base_case_name


Is it okay for this value to be None? Do we need something like

if base_case_name: a_dict["diag_cam_baseline_climo"]["cam_case_name"] = base_case_name elif "cam_case_name" in "cam_case_name": del a_dict["diag_cam_baseline_climo"]["cam_case_name"]

Following up because I think the github formatting made it hard to see the context of my comment. This is specifically referring to a_dict["diag_cam_baseline_climo"]["cam_case_name"] = base_case_name, since base_case_name = c_dict["global_params"].get("base_case_name") -- what is the expected behavior if base_case_name is not in global_params?

In the ADF config file, compare_obs determines whether the model run is compared to observations or another model run. If this variable is false in the ADF config file, then diag_cam_baseline_climo is used and is a required config file section. If not present, ADF fails.

I'm curious if we would want to support comparison to obs in addition to model-model comparison-- @brianpm and @justin-richling do you have thoughts on this?

With all that said, I think that we could just use base_case_name = c_dict["global_params"]["base_case_name"] instead of the .get() since it is an expected value for model-model comparisons. I'll update this now.

I think we should allow for the model vs obs comparisons. In adf_info.py there is the variable data_name that gets set for the baseline case name if model vs model, but data_name gets set to "Obs" if its a model vs obs case.

I'm a bit hesitant to include this feature within this PR if we want to get this in before Wednesday. I think it will require an additional default configuration file from ADF as well as a number of additional parameters in the cupid configuration file (including compare_obs) as well as parameters that are added to the script to generate the ADF config file.

See issue #150

helper_scripts/generate_adf_config_file.py

mnlevy1981 · 2024-11-15T18:26:19Z

helper_scripts/generate_adf_config_file.py

+    base_case_cupid_ts_index = (
+        ts_case_names.index(base_case_name) if base_case_name in ts_case_names else None
+    )
+    if base_case_name is not None:


if base_case_name: would be consistent with the ts_case_names check above:

ts_case_names = c_ts.get("case_name") if not ts_case_names: raise ValueError("CUPiD file does not have timeseries case_name array.")

This if statement is not necessary if we expect base_case_name to always be in the CUPiD config

Just updated this

helper_scripts/generate_adf_config_file.py

TeaganKing · 2024-11-15T20:11:47Z

Thanks for these suggestions, @mnlevy1981 ! I think I have addressed everything (with a few notes in response to your direct in-line comments); let me know if you have any further requests.

Instead of looking for a variable list in compute_notebooks.atm.link_to_ADF.parameter_groups.none.adf_vars Loop through all compute_notebooks.${component}.${notebook} and look for compute_notebooks.${component}.${notebook}.external_tool if external_tool.tool_name is 'ADF', append everything in external_tool.vars to diag_var_list. Also append everything in external_tool.plotting_scripts to plotting_scripts (to make it easier to control what plots get made). for both cases, if variable isn't found in any external_tool then the default list values from ADF will be used.

Didn't realize pre-commit hadn't run on my last commit

helper_scripts/generate_adf_config_file.py

mnlevy1981 · 2024-11-18T16:04:36Z

helper_scripts/generate_adf_config_file.py

+    if base_case_name:
+        a_dict["diag_cam_baseline_climo"]["cam_case_name"] = base_case_name
+
+    elif "cam_case_name" in "cam_case_name":
+        del a_dict["diag_cam_baseline_climo"]["cam_case_name"]


Since you dropped the .get() for base_case_name, I think we want to drop the if/else and just keep line 87

mnlevy1981

This looks great! I think all that's left is to hear @brianpm and @justin-richling's thoughts on #142 (comment) and then update a few spots accordingly.

mnlevy1981 · 2024-11-18T16:45:10Z

helper_scripts/generate_adf_config_file.py

+    if "cam_case_name" in "cam_case_name":
        del a_dict["diag_cam_baseline_climo"]["cam_case_name"]


I think we want to remove these two lines. The if statement isn't posed very well (it should probably be if "cam_case_name" in a_dict["diag_cam_baseline_climo"]:), but it was originally meant to catch the case where a_dict["diag_cam_baseline_climo"]["cam_case_name"] was read in from the template file and we wanted to remove it because base_case_name wasn't defined. Now we are always setting a_dict["diag_cam_baseline_climo"]["cam_case_name"] = base_case_name so we don't want to remove the dictionary entry.

brianpm and others added 6 commits October 14, 2024 14:57

really adding the script this time (??)

fd4db19

added newline at end of file per linting suggestion

97565d0

just more linting/formatting on my end

bd92dc1

add more details for ADF config

49a0f99

Merge branch 'NCAR:main' into gen_adf_config

c9759dc

fix typo for sname

13c0620

TeaganKing added 3 commits November 4, 2024 12:55

ran pre-commit

eb25a31

minor change & note sname portion

9fd4ca9

linting suggestion from after commit

9ed14fd

TeaganKing mentioned this pull request Nov 4, 2024

Create a new example that will run full ADF package (not from notebook) #141

Merged

6 tasks

TeaganKing added 2 commits November 4, 2024 13:30

update path to adf

9ba9d6d

include case name and not just CESM output dir

ab8d539

included hist_str for top level rather than case-specific

db5a6f9

updates to config file generation to include full path to ADF output …

db47343

…and add atm in proc/tseries paths

include _build/html in path

35a9ebb

TeaganKing assigned TeaganKing and brianpm Nov 8, 2024

TeaganKing added the atm label Nov 8, 2024

include option for overwriting vars in adf config file

8f79704

mnlevy1981 requested changes Nov 15, 2024

View reviewed changes

TeaganKing mentioned this pull request Nov 15, 2024

Use click in helper scripts directory #149

Open

TeaganKing added 2 commits November 15, 2024 12:26

initial response to review comments

0388b54

additional review updates

d8a563f

mnlevy1981 added 2 commits November 15, 2024 16:02

Pass CI tests

605a59c

Didn't realize pre-commit hadn't run on my last commit

mnlevy1981 reviewed Nov 16, 2024

View reviewed changes

helper_scripts/generate_adf_config_file.py Outdated Show resolved Hide resolved

TeaganKing added 2 commits November 18, 2024 08:31

update hist_str line

a2308d7

remove .get from base_case_name

677a668

TeaganKing requested a review from mnlevy1981 November 18, 2024 15:53

mnlevy1981 reviewed Nov 18, 2024

View reviewed changes

mnlevy1981 requested changes Nov 18, 2024

View reviewed changes

update if statements for base case name

101e1ac

mnlevy1981 reviewed Nov 18, 2024

View reviewed changes

remove if statement cam_case_name

750e005

TeaganKing mentioned this pull request Nov 18, 2024

Include capabilities for adding ADF comparisons with observations #150

Open

TeaganKing marked this pull request as ready for review November 18, 2024 17:48

mnlevy1981 approved these changes Nov 18, 2024

View reviewed changes

TeaganKing merged commit b9e0964 into NCAR:main Nov 18, 2024
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

generate ADF config file #142

generate ADF config file #142

brianpm commented Oct 14, 2024 •

edited by TeaganKing

Loading

brianpm commented Oct 23, 2024

TeaganKing commented Nov 4, 2024 •

edited

Loading

TeaganKing commented Nov 4, 2024 •

edited

Loading

TeaganKing commented Nov 7, 2024

TeaganKing commented Nov 8, 2024 •

edited

Loading

TeaganKing commented Nov 12, 2024

TeaganKing commented Nov 13, 2024

mnlevy1981 left a comment

mnlevy1981 Nov 15, 2024

mnlevy1981 Nov 16, 2024

TeaganKing Nov 18, 2024

justin-richling Nov 18, 2024

TeaganKing Nov 18, 2024

TeaganKing Nov 18, 2024

mnlevy1981 Nov 15, 2024

mnlevy1981 Nov 18, 2024

TeaganKing Nov 18, 2024

TeaganKing commented Nov 15, 2024

mnlevy1981 Nov 18, 2024

mnlevy1981 left a comment

mnlevy1981 Nov 18, 2024

TeaganKing Nov 18, 2024

		if "cam_case_name" in "cam_case_name":
		del a_dict["diag_cam_baseline_climo"]["cam_case_name"]

generate ADF config file #142

generate ADF config file #142

Conversation

brianpm commented Oct 14, 2024 • edited by TeaganKing Loading

New Feature Submissions:

Changes to Core Features:

brianpm commented Oct 23, 2024

TeaganKing commented Nov 4, 2024 • edited Loading

TeaganKing commented Nov 4, 2024 • edited Loading

TeaganKing commented Nov 7, 2024

TeaganKing commented Nov 8, 2024 • edited Loading

TeaganKing commented Nov 12, 2024

TeaganKing commented Nov 13, 2024

mnlevy1981 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

TeaganKing commented Nov 15, 2024

Choose a reason for hiding this comment

mnlevy1981 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

brianpm commented Oct 14, 2024 •

edited by TeaganKing

Loading

TeaganKing commented Nov 4, 2024 •

edited

Loading

TeaganKing commented Nov 4, 2024 •

edited

Loading

TeaganKing commented Nov 8, 2024 •

edited

Loading