A pretty specific issue here: when extracting data with xs.extract_dataset, the attributes within the resulting dataset are labeled "intake_esm_attrs:..." rather than "cat:..." when extraction is done within a Dask client, and within JupyterLab. Doing the same thing but running from within a script produces the correct attribute names. Running outside of a Dask client works as normal, both in a script and in a Notebook.
Steps To Reproduce
Simple example. Attributes are correct for this version (e.g.: 'cat:id':'CMIP6_ScenarioMIP_CCCma_CanESM5_ssp370_r1i1p1f1_global'):
However, if xs.extract_dataset is wrapped in a Dask client, the attribute keys are incorrect (e.g., 'intake_esm_attrs:id': 'CMIP6_ScenarioMIP_CCCma_CanESM5_ssp370_r1i1p1f1_global'):
This seems related to how the client workers are created, and then to how the Dask functions are pickled and sent to them.
In a script, this happens if xscen is imported after if __name__ == '__main__'. We don't usually do that, of course. In a notebook, however, all code effectively runs after the equivalent of that guard.
From the answer I got on StackOverflow, I don't see any easy xscen-level solution. The issue is that intake-esm relies on a global state variable (its options) inside a Dask Delayed function, although "Dask is generally aiming to be functional/stateless, so that each function call produces results based only on the arguments it is supplied".
Passing the options as an argument to the Delayed would fix that. On our side, the Client.run hack seems to be good enough.
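To make the difference concrete, here is a minimal stdlib-only sketch of the two approaches. It does not use Dask or intake-esm: OPTIONS, DEFAULTS, label_stateful, label_stateless and run_on_fresh_worker are all hypothetical names, and the "worker" is only simulated by temporarily resetting the global to its defaults. It assumes, as the discussion above suggests, that the prefix is changed on the client side but not on the workers.

```python
# Hypothetical stand-in for a library's global options (not intake-esm's real code).
DEFAULTS = {"attrs_prefix": "intake_esm_attrs"}
OPTIONS = dict(DEFAULTS)


def label_stateful(key):
    """Reads the global: the result depends on the state of the process running it."""
    return f"{OPTIONS['attrs_prefix']}:{key}"


def label_stateless(key, options):
    """Reads only its arguments: the result is the same in every process."""
    return f"{options['attrs_prefix']}:{key}"


def run_on_fresh_worker(func, *args):
    """Simulate a worker whose copy of the global still holds the defaults."""
    global OPTIONS
    saved = OPTIONS
    OPTIONS = dict(DEFAULTS)
    try:
        return func(*args)
    finally:
        OPTIONS = saved


# Client side: the prefix is changed after import (assumed to mirror what happens here).
OPTIONS["attrs_prefix"] = "cat"
client_options = dict(OPTIONS)

stateful_result = run_on_fresh_worker(label_stateful, "id")
stateless_result = run_on_fresh_worker(label_stateless, "id", client_options)

print(stateful_result)   # intake_esm_attrs:id
print(stateless_result)  # cat:id
```

The stateful version silently falls back to the default prefix on the simulated worker, while the version that receives the options as an argument keeps the client's setting.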
Additional context
Workaround suggested by @aulemahal:
Inside the "with Client(...) as client:" block, before calling xs.extract_dataset, insert the following line, which apparently makes each worker import xscen:
client.run(lambda: xs.__version__)