Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding EOOffshore CMEMS ASCAT REP and NRT recipes #183

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

derekocallaghan
Copy link
Contributor

These recipes will create two data sets containing Copernicus Marine Service (CMS), or Copernicus Marine Environment Monitoring Service (CMEMS) Reprocessed (2007 - 2021) and Near Real Time (2016 - 2021) Metop-A/B/C ASCAT Ascending/Descending products, for the Irish Continental Shelf (ICS) region, where the original time coordinates have been replaced with the satellite/pass measurement_time values. Products are retrieved from the following source data sets:

The recipes will recreate the ASCAT data sets used in the EOOffshore project (https://eooffshore.github.io), whose outputs were presented (Scalable Offshore Wind Analysis With Pangeo) at the Meeting Exascale Computing Challenges with Compression and Pangeo 2022 EGU General Assembly session.

Example usage of the ASCAT data sets in EOOffshore:

Note:

  • Pruned versions of both recipes have been successfully tested using a local installation of pangeo_forge_recipes, version 0.9.1.
  • Products are retrieved using FTP, where the credentials specified here are used.
  • The time coordinates provided by ASCAT products are replaced with a corresponding ICS spatial mean of the measurement_time variable. However, this means that the file pattern URLs need to be sorted prior to subseqent Zarr store creation. This has been achieved by subclassing XarrayZarrRecipe and replacing both self.file_pattern and self._compiler, where the latter function inserts an additional Stage that performs the sort operation into the Pipeline.

Copy link
Member

@cisaacstern cisaacstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@derekocallaghan, this is certainly a very clever way to introduce a custom Stage.

In forthcoming releases, pangeo-forge-recipes will support arbitrary stages. Until then, something like this (or chaining two recipes) is probably the only path for customization.

We don't need the setup_logging import & call here. Once you've accepted the below suggestions removing them, we can try a test run of this recipe on Pangeo Forge Cloud.

@cisaacstern
Copy link
Member

/run eooffshore_ics_cmems_WIND_GLO_WIND_L3_REP_OBSERVATIONS_012_005_MetOp_ASCAT

@cisaacstern
Copy link
Member

/run eooffshore_ics_cmems_WIND_GLO_WIND_L3_NRT_OBSERVATIONS_012_002_MetOp_ASCAT

@cisaacstern
Copy link
Member

cisaacstern commented Sep 19, 2022

I noticed the /run calls above errored on the backend. Here are some notes on why, and what we can do about it:

  • The error was:
    File "/pangeo_forge_recipes/serialization.py", line 20, in either_encode_or_hash
    raise TypeError(f"object of type {type(obj).__name__} not serializable"
    
  • This is because the backend is currently running pangeo-forge-recipes==0.9.0, which does not have this function serialization code. (The meta.yaml correctly specifies a requirement for 0.9.1, but the ability to dynamically use that pangeo-forge-recipes version on the backend does not yet exist.)
  • As a first step then, we know we need to upgrade to 0.9.1 on the backend.
  • But once that's complete, we'll immediately hit another issue. The latest release of pangeo-forge-runner, version 0.5, which the backend uses for deploying jobs, evaluates recipe code in such a way that inspect.get_source does not work, and will raise OSError: could not get source code. I know @yuvipanda is aware of this (we were discussing it last week), but I'm not clear if this issue is fixed in an unreleased version of pangeo-forge-runner, or if it remains unresolved.

So, to recap, we need to:

  • Upgrade backend to 0.9.1
  • Release a version of pangeo-forge-runner which can call inspect.get_source without erroring. (Either by simply releasing from the current main, if it's fixed there, or making a PR, and then releasing.)

and then we can re-run these tests.

@cisaacstern cisaacstern added blocked:backend-release Requires a new release of the orchestrator backend. blocked:runner-release Requires a new release of pangeo-forge-runner. labels Sep 19, 2022
@derekocallaghan
Copy link
Contributor Author

Thanks for the update @cisaacstern. Should I look into getting the recipe to work with a local copy of 0.9.0, or is it best to stick with 0.9.1 and wait until after the upgrade/release?

@cisaacstern
Copy link
Member

@derekocallaghan, let's stick with 0.9.1 and wait until upgrade/release to re-run.

@cisaacstern
Copy link
Member

I've opened pangeo-forge/pangeo-forge-runner#31 to track adding tests to pangeo-forge-runner for the blocking issue here.

@andersy005
Copy link
Member

pre-commit.ci autofix

@derekocallaghan
Copy link
Contributor Author

Hi @cisaacstern, it looks like the pre-commit autofix didn't work as the PR was made from an organization, similar to #145 (comment)

@derekocallaghan
Copy link
Contributor Author

Hi @cisaacstern, I don't think these recipes have been run yet, but I wanted to confirm that all imports required by the process_input function are in the function body, similar to this PR recipe: #145 (comment)

It looks like the current process_ds() function should be okay. My only question is whether the function type hints will cause any problems? I can remove these if required, thanks.

def process_ds(
    ds: xr.Dataset, fname: str, satellites: List[str], service: str
) -> xr.Dataset:

@cisaacstern
Copy link
Member

cisaacstern commented Oct 4, 2022

@derekocallaghan, thanks for checking in. Since we last looked at this, the two TODO items in #183 (comment) are now complete. I'll take a moment here to explain how I confirmed that, in the event it's useful to @andersy005 or other admins to see the thought process. Then after this comment, I'll trigger tests of your recipes here. The TODO items were:

  1. Upgrade backend to 0.9.1. To confirm this, I opened a shell on our production server, and ran:

    Note: Only Pangeo Forge Cloud admins will be able to open a shell on the production server like this.

    $ heroku run /bin/bash -a pangeo-forge-api-prod
    Running /bin/bash on ⬢ pangeo-forge-api-prod... up, run.1629 (Hobby)
    ~ $ pip show pangeo-forge-recipes
    Name: pangeo-forge-recipes
    Version: 0.9.1
    Summary: Pipeline tools for building and publishing analysis ready datasets.
    Home-page: https://github.com/pangeo-forge/pangeo-forge-recipes
    Author: 
    Author-email: 
    License: Apache
    Location: /usr/local/lib/python3.9/dist-packages
    Requires: dask, distributed, fsspec, h5netcdf, h5py, intake, intake-xarray, kerchunk, mypy-extensions, netcdf4, numcodecs, setuptools, xarray, zarr
    Required-by: pangeo-forge-runner
  2. Release a version of pangeo-forge-runner which can call inspect.get_source without erroring. To test this, I ran the latest release of pangeo-forge-runner==0.6.1 from my local terminal:
    $ pangeo-forge-runner expand-meta --repo=https://github.com/eooffshore/staged-recipes --ref=d33441bfa8d3d950a78b0deea02d9da982a3d01a --json --feedstock-subdir=recipes/eooffshore_ics_cms_WIND_GLO_WIND_L3_OBSERVATIONS_Metop_ASCAT
    {"message": "Picked Git content provider.\n", "status": "fetching"}
    {"message": "Cloning into '/var/folders/tt/4f941hdn0zq549zdwhcgg98c0000gn/T/tmpr55acjum'...\n", "status": "fetching"}
    {"message": "HEAD is now at d33441b Fixed typos in metadata (recipe runs are currently postponed as described above)\n", "status": "fetching"}
    {"message": "Expansion complete", "status": "completed", "meta": {"title": "eooffshore_ics_cms_WIND_GLO_WIND_L3_OBSERVATIONS_Metop_ASCAT", "description": "EOOffshore Project (https://eooffshore.github.io) - Copernicus Marine Service Reprocessed (2007 - 2021) and Near Real Time (2016 - 2021) Metop-A/B/C ASCAT Ascending/Descending products, for the Irish Continental Shelf region, where the original time coordinates have been replaced with the satellite/pass measurement_time values.", "pangeo_forge_version": "0.9.1", "pangeo_notebook_version": "2022.06.02", "recipes": [{"id": "eooffshore_ics_cmems_WIND_GLO_WIND_L3_REP_OBSERVATIONS_012_005_Metop_ASCAT", "object": "recipe:rep_recipe"}, {"id": "eooffshore_ics_cmems_WIND_GLO_WIND_L3_NRT_OBSERVATIONS_012_002_Metop_ASCAT", "object": "recipe:nrt_recipe"}], "provenance": {"providers": [{"name": "Copernicus Marine Service (CMS)", "description": "The Copernicus Marine Service (CMS), or Copernicus Marine Environment Monitoring Service (CMEMS), is the marine component of the European Union Copernicus Earth Observation (EO) programme, providing free, regular and systematic ocean data products on a global and regional scale.", "roles": ["producer", "licensor"], "url": "https://marine.copernicus.eu/"}], "license": "Generated using E.U. Copernicus Marine Service Information; https://doi.org/10.48670/moi-00182; https://doi.org/10.48670/moi-00183;"}, "maintainers": [{"name": "Derek O'Callaghan", "orcid": "0000-0003-0585-8631", "github": "derekocallaghan"}], "bakery": {"id": "pangeo-ldeo-nsf-earthcube"}}}
    The fact that this call did not error tells us that this problem is resolved in pangeo-forge-runner==0.6.1, which is the version on the latest deployment of the backend server.

@cisaacstern
Copy link
Member

/run eooffshore_ics_cmems_WIND_GLO_WIND_L3_REP_OBSERVATIONS_012_005_Metop_ASCAT

@cisaacstern
Copy link
Member

/run eooffshore_ics_cmems_WIND_GLO_WIND_L3_NRT_OBSERVATIONS_012_002_Metop_ASCAT

@derekocallaghan
Copy link
Contributor Author

Thanks for the update @cisaacstern, and for triggering the recipe tests. I see that they were started and have been running for a few hours now. I'm assuming that there's a problem with the recipes as the pruned version normally runs in a couple of minutes locally. I don't see any logs, would you be able to check on your side? Thanks.

@derekocallaghan
Copy link
Contributor Author

Thanks for the update @cisaacstern, and for triggering the recipe tests. I see that they were started and have been running for a few hours now. I'm assuming that there's a problem with the recipes as the pruned version normally runs in a couple of minutes locally. I don't see any logs, would you be able to check on your side? Thanks.

* https://pangeo-forge.org/dashboard/recipe-run/1094?feedstock_id=1

* https://pangeo-forge.org/dashboard/recipe-run/1095?feedstock_id=1

Hi @cisaacstern, I just saw your reply regarding logs to a separate PR, so no problem if they're not readily available for these ASCAT recipes, thanks.

@cisaacstern
Copy link
Member

@derekocallaghan thanks for checking in. I've taken a look on the backend and I actually do not see these jobs registered on Google Cloud Dataflow. This tells me there was likely some problem with the recipe parsing and/or job submission process. As it happens, I am beginning a roughly monthlong family leave tomorrow, however it's possible that another maintainer will be able to help debug this in my absence, in which case my recommendation would be for a maintainer to re-trigger these test runs while tracking the backend server logs, as described in:

https://github.com/pangeo-forge/pangeo-forge-orchestrator/tree/main/docs#heroku-server-logs

This may provide some hint of the problem. If the problem is at the pangeo-forge-runner layer, a further debugging step, as described in pangeo-forge/pangeo-forge-orchestrator#148 (comment) may be necessary.

@rabernat
Copy link
Contributor

Hi @yuvipanda - any chance you might be able to help us debug this recipe run in Charles' absence?

@derekocallaghan
Copy link
Contributor Author

I've been trying to run these recipes locally with Beam (and looking at pangeo_forge_runner.feedstock.Feedstock._import()), and it looks like the issues are being caused by the use of class AscatXarrayZarrRecipe(XarrayZarrRecipe). Subsequent pickling of the recipe objects results in the following being raised by pickle.save_global():

_pickle.PicklingError: Can't pickle <class 'AscatXarrayZarrRecipe'>: it's not found as builtins.AscatXarrayZarrRecipe

I'll look at refactoring the recipes so that the required operations are performed on XarrayZarrRecipe instances instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocked:backend-release Requires a new release of the orchestrator backend. blocked:runner-release Requires a new release of pangeo-forge-runner.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants