Better automate variable derivations in post-processing workflows #605
Comments
Since the
SymPy is a symbolic math library for Python
https://github.com/E3SM-Project/e3sm_diags/blob/main/e3sm_diags/derivations/acme.py seems to be composed of more or less the following sections: L619-2161 (the derived variables dict) is a dictionary mapping variables (as strings) to ordered dictionaries, which in turn map variables (as strings) to functions. I'm assuming that, because ordered dictionaries are used, the code goes through the possible substitutions in that prescribed order.
The logic of deriving variables actually extends further into https://github.com/E3SM-Project/e3sm_diags/blob/main/e3sm_diags/e3sm_diags_vars.py. This block almost makes it look like we'd need all possible base variables present in the user's file (i.e., there's no filtering on
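To make the structure described above concrete, here is a minimal sketch of an ordered-dictionary lookup where the first derivation whose inputs are all present wins. The names `derived_variables`, `derive`, `convert_flux`, and `add_precip` are hypothetical, the tuple-of-names keys are an assumption, and the conversion functions are simplified placeholders rather than the actual e3sm_diags code.

```python
from collections import OrderedDict

# Hypothetical stand-ins for the real conversion functions.
def convert_flux(pr):
    # Placeholder conversion: kg m-2 s-1 to mm/day.
    return pr * 86400.0

def add_precip(precc, precl):
    # Placeholder conversion: sum two rates in m/s, then convert to mm/day.
    return (precc + precl) * 86400.0 * 1000.0

# Target variable -> ordered candidates, each mapping a tuple of input
# names (an assumed key layout) to the function that combines them.
derived_variables = {
    "PRECT": OrderedDict([
        (("pr",), convert_flux),
        (("PRECC", "PRECL"), add_precip),
    ]),
}

def derive(target, available):
    """Return the target value, trying candidate derivations in order."""
    if target in available:
        return available[target]
    for inputs, func in derived_variables.get(target, {}).items():
        # Use the first candidate whose inputs are all present.
        if all(name in available for name in inputs):
            return func(*(available[name] for name in inputs))
    return None  # No derivation applies.
```

With this layout, only the inputs of the first matching candidate need to exist in the user's file; a call such as `derive("PRECT", {"pr": 1.0})` never looks at `PRECC` or `PRECL`.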
I feel like a recursive approach as in https://github.com/E3SM-Project/zppy/blob/main/zppy/templates/readTS.py would be the cleanest. It would be easier to follow than the derived variable dictionary. However, short of re-implementing the entire derivation code to check, I'm not sure it would fully cover everything.

    from typing import Dict, Optional

    # `var` is a placeholder for whatever type holds a variable's data.
    def get_var(var_name: str, defined_vars: Dict[str, var]) -> Optional[var]:
        if var_name in defined_vars:
            return defined_vars[var_name]
        elif var_name == "PRECT":
            pr = get_var("pr", defined_vars)
            if pr is not None:
                return qflxconvert_units(pr)
            # Try second derivation method
            precc = get_var("PRECC", defined_vars)
            precl = get_var("PRECL", defined_vars)
            if precc is not None and precl is not None:
                return prect(precc, precl)
            # Try third derivation method
            ...
        else:
            # Could not define the variable
            return None

It's possible the third-party symbolic algebra package would be the cleanest solution. I suppose we could try to define the variables as symbols in SymPy and work from there, but we may have too much going on here -- names of variables, and also their values and units.
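As a rough illustration of the SymPy idea, the sketch below declares variables as symbols and states one relation, then asks SymPy to rearrange it for whichever variable is missing. The relation `PRECT = PRECC + PRECL`, the `relations` list, and the `solve_for` helper are assumptions made for the example. Note that it only handles names and values; units are not tracked, which is exactly the concern raised above.

```python
import sympy

PRECT, PRECC, PRECL = sympy.symbols("PRECT PRECC PRECL")

# One equation per physical relation (illustrative assumption).
relations = [sympy.Eq(PRECT, PRECC + PRECL)]

def solve_for(target, available_names):
    """Return (argument symbols, callable) for target, or None."""
    available = {sympy.Symbol(name) for name in available_names}
    for relation in relations:
        if target not in relation.free_symbols:
            continue
        # Rearrange the relation for the target symbol.
        for solution in sympy.solve(relation, target):
            # Accept the solution only if every input is available.
            if solution.free_symbols <= available:
                args = sorted(solution.free_symbols, key=lambda s: s.name)
                return args, sympy.lambdify(args, solution)
    return None

# Derive PRECC from PRECT and PRECL without writing a separate rule.
args, func = solve_for(PRECC, ["PRECT", "PRECL"])
precc_value = func(3.0, 5.0)  # arguments follow `args`: PRECL, then PRECT
```

The appeal is that a single stated relation covers every rearrangement; the cost is that anything beyond plain algebra (unit conversion, masking, regridding) would still need ordinary functions.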
@xylar Do you know of any packages or algorithms that would handle something like this well? (This is a lower-priority item; it's just something that has come up a few times now as being potentially useful). Or maybe option (1)/(2) below would be the better path forward?
@forsyth2, thanks for pinging me on this. I don't have any experience with this myself. I haven't tried to allow users to define their own new products and such.
Request criteria
Issue description
Currently, variable derivations are handled on a per-package basis. For example, in the global_time_series task, the derivations are handled in https://github.com/E3SM-Project/zppy/blob/main/zppy/templates/readTS.py and in the e3sm_diags package, the derivations are handled in https://github.com/E3SM-Project/e3sm_diags/blob/main/e3sm_diags/derivations/acme.py.
It would make more sense for derivations to be handled uniformly. Possible options:
e3sm_diags package and the global_time_series zppy task would both call this new package to derive it from the given data.
It's possible a generic package (e.g., a symbolic/computer algebra library) could accomplish (3) without much extra work from us.
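For comparison with a generic library, a shared package could be fairly small. The sketch below is one possible shape, assuming a single rule table that both callers import and a recursive resolver that can chain derivations. `RULES`, `resolve`, the `PRECT_MM` variable, and the arithmetic in the rules are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence, Set, Tuple

Rule = Tuple[Sequence[str], Callable[..., float]]

# Hypothetical shared rule table: target -> candidate (inputs, function)
# pairs, listed in order of preference.
RULES: Dict[str, List[Rule]] = {
    "PRECT": [(("PRECC", "PRECL"), lambda conv, large: conv + large)],
    "PRECC": [(("PRECT", "PRECL"), lambda total, large: total - large)],
    "PRECT_MM": [(("PRECT",), lambda total: total * 86400.0 * 1000.0)],
}

def resolve(name: str, data: Dict[str, float],
            _pending: Optional[Set[str]] = None) -> Optional[float]:
    """Return the value of `name`, deriving it recursively if needed."""
    pending = set() if _pending is None else _pending
    if name in data:
        return data[name]
    if name in pending:
        return None  # Already being derived higher up: avoid a cycle.
    pending.add(name)
    try:
        for inputs, func in RULES.get(name, []):
            values = [resolve(i, data, pending) for i in inputs]
            if all(v is not None for v in values):
                return func(*values)
        return None  # No rule could be satisfied.
    finally:
        pending.discard(name)
```

Both e3sm_diags and the global_time_series task would then call something like `resolve("PRECT_MM", data)` and share one table. The cycle guard matters because rules such as `PRECT` and `PRECC` refer to each other.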