[REQUEST]: HighResMIP sea-ice dataset #180

Open
mvichi opened this issue Oct 26, 2024 · 11 comments
Labels
request Requests for new data to be ingested to the cloud

Comments

@mvichi

mvichi commented Oct 26, 2024

List of requested iids

'CMIP6.HighResMIP.AWI.AWI-CM-1-1-HR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20170825',
 'CMIP6.HighResMIP.AWI.AWI-CM-1-1-LR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20170825',
 'CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200921',
 'CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200921',
 'CMIP6.HighResMIP.CMCC.CMCC-CM2-HR4.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200917',
 'CMIP6.HighResMIP.CMCC.CMCC-CM2-HR4.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200917',
 'CMIP6.HighResMIP.CMCC.CMCC-CM2-VHR4.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200917',
 'CMIP6.HighResMIP.CMCC.CMCC-CM2-VHR4.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200917',
 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1-HR.hist-1950.r1i1p1f2.SImon.siconc.gn.v20190221',
 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1-HR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20190221',
 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1.hist-1950.r1i1p1f2.SImon.siconc.gn.v20190401',
 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1.hist-1950.r1i1p1f2.SImon.sivol.gn.v20190401',
 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P-HR.hist-1950.r1i1p2f1.SImon.siconc.gn.v20181212',
 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P-HR.hist-1950.r1i1p2f1.SImon.sivol.gn.v20181212',
 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r1i1p2f1.SImon.siconc.gn.v20190314',
 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r1i1p2f1.SImon.sivol.gn.v20190314',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170915',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20170915',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180221',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180221',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-MR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20181119',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-MR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20181119',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-HM.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180730',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-HM.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180730',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-LL.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170921',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-LL.hist-1950.r1i1p1f1.SImon.sivol.gn.v20170921',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170928',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i1p1f1.SImon.sivol.gn.v20170928',
 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180606',
 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180606',
 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-XR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180606',
 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-XR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180606',
 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200810',
 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200810',
 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-LR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200812',
 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-LR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200812',
 'CMIP6.HighResMIP.NERC.HadGEM3-GC31-HH.hist-1950.r1i1p1f1.SImon.siconc.gn.v20210416',
 'CMIP6.HighResMIP.NERC.HadGEM3-GC31-HH.hist-1950.r1i1p1f1.SImon.sivol.gn.v20210416',
 'CMIP6.HighResMIP.NOAA-GFDL.GFDL-CM4C192.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180701',
 'CMIP6.HighResMIP.NOAA-GFDL.GFDL-CM4C192.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180701',
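Each entry in the list above is a CMIP6 instance id (iid) made of ten dot-separated facets. As a minimal sketch, an iid can be split into its facets and the request summarised per model. The facet names follow the usual CMIP6 naming; the helpers `parse_iid` and `variables_by_model` are illustrative names, not part of the ingestion code.

```python
from collections import defaultdict

# Facet names in the order they appear in a CMIP6 instance id.
FACETS = (
    "mip_era", "activity_id", "institution_id", "source_id", "experiment_id",
    "member_id", "table_id", "variable_id", "grid_label", "version",
)


def parse_iid(iid: str) -> dict:
    """Split an instance id into a facet -> value mapping."""
    parts = iid.split(".")
    if len(parts) != len(FACETS):
        raise ValueError(f"expected {len(FACETS)} facets, got {len(parts)}: {iid}")
    return dict(zip(FACETS, parts))


def variables_by_model(iids):
    """Group the requested variables by source_id (the model name)."""
    grouped = defaultdict(set)
    for iid in iids:
        facets = parse_iid(iid)
        grouped[facets["source_id"]].add(facets["variable_id"])
    return dict(grouped)


# Three entries copied from the request above.
example = [
    "CMIP6.HighResMIP.AWI.AWI-CM-1-1-HR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20170825",
    "CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200921",
    "CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200921",
]
print(variables_by_model(example))
```

Run over the full list, this kind of grouping makes it easy to spot models requested with only one of the two variables (the AWI models, for instance, are listed with sivol only).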

Description

Hi guys, thanks a lot for your effort and for continuously improving the system.
We recently ran an analysis on the HighResMIP output to assess the performance of sea ice simulations in the northern and southern hemispheres. The work was done with the "download model" and was published in the two papers listed below (Selivanova et al., 2024a,b).
An MSc student at the University of Cape Town is currently adapting the SItool (Lin et al., 2021) to work with Pangeo, and we would also like to add an assessment of the HighResMIP models. The student is currently testing the system with the low-resolution CMIP6 models, so it would be great to have the HighResMIP data available as well. The thesis should be submitted in February 2025, but the analysis should ideally be completed before the end of 2024.
Thanks in advance,
Marcello

Lin, X., Massonnet, F., Fichefet, T., Vancoppenolle, M., 2021. SITool (v1.0) – a new evaluation tool for large-scale sea ice simulations: application to CMIP6 OMIP. Geoscientific Model Development 14, 6331–6354. https://doi.org/10.5194/gmd-14-6331-2021
Selivanova, J., Iovino, D., Cocetta, F., 2024a. Past and future of the Arctic sea ice in High-Resolution Model Intercomparison Project (HighResMIP) climate models. The Cryosphere 18, 2739–2763. https://doi.org/10.5194/tc-18-2739-2024
Selivanova, J., Iovino, D., Vichi, M., 2024b. Limited Benefits of Increased Spatial Resolution for Sea Ice in HighResMIP Simulations. Geophysical Research Letters 51, e2023GL107969. https://doi.org/10.1029/2023GL107969

@mvichi mvichi added the request Requests for new data to be ingested to the cloud label Oct 26, 2024
jbusecke added a commit that referenced this issue Oct 28, 2024
@jbusecke
Collaborator

Hi @mvichi, thanks for using the cloud data. I just started #181 as a test and will run the full thing as soon as the PR succeeds!
I am very busy this week, but this should squeeze in between other tasks and is related to my work all week. So please feel free to ping me here or via email ([email protected]) in the likely case that I forget to move on this. I am motivated to get as much data up as possible for your deadline.

@jbusecke
Collaborator

jbusecke commented Oct 28, 2024

Seems like we are getting only 6 datasets from the ESGF API right now. Is that useful to ingest already? Happy to rerun things a few times and hope for better availability! EDIT: This was my bad. I did not allow all member_id values. Let's see how many we get now!

@jbusecke
Collaborator

jbusecke commented Oct 28, 2024

Ok this looks better:

'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r3i1p2f1.SImon.sivol.gn.v20190215',
'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r1i1p2f1.SImon.siconc.gn.v20190314',
'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i3p1f1.SImon.siconc.gn.v20190710',
'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1.hist-1950.r1i1p1f2.SImon.siconc.gn.v20190401',
'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r1i1p2f1.SImon.sivol.gn.v20190314',
'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r6i1p1f1.SImon.siconc.gn.v20181119',
'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r8i1p1f1.SImon.siconc.gn.v20190425',
'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-LL.hist-1950.r1i5p1f1.SImon.siconc.gn.v20190418',
'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170915',
'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1-HR.hist-1950.r2i1p1f2.SImon.sivol.gn.v20200615',
'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-HR.hist-1950.r3i1p1f1.SImon.siconc.gn.v20181119',
'CMIP6.HighResMIP.CMCC.CMCC-CM2-HR4.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200917',
'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-HM.hist-1950.r1i3p1f1.SImon.siconc.gn.v20190710',
'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1-HR.hist-1950.r3i1p1f2.SImon.sivol.gn.v20200615',
'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r3i1p1f1.SImon.siconc.gn.v20181119',
'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1.hist-1950.r3i1p1f2.SImon.siconc.gn.v20200224',
'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i2p1f1.SImon.siconc.gn.v20190710',
'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r2i1p2f1.SImon.siconc.gn.v20190812',
'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r2i1p2f1.SImon.sivol.gn.v20190812',
'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i2p1f1.SImon.sivol.gn.v20190710'

These seem to be available right now. Not all that you requested, but I'll try to run these now and we can rerun later.
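Note that the list above also contains members (for example r3i1p2f1) and versions that were not in the original request. A small sketch of how the available iids could be separated into exact matches and "near" matches that differ only in member_id and/or version; the function name `split_matches` is hypothetical and the example iids are copied from this thread.

```python
def split_matches(requested, available):
    """Return (exact, near): exact matches, and iids that differ from a
    requested iid only in member_id and/or version."""

    def model_key(iid):
        parts = iid.split(".")
        # Ignore member_id (index 5) and version (index 9) when comparing.
        return tuple(p for i, p in enumerate(parts) if i not in (5, 9))

    requested_set = set(requested)
    requested_keys = {model_key(iid) for iid in requested}
    exact = [iid for iid in available if iid in requested_set]
    near = [
        iid for iid in available
        if iid not in requested_set and model_key(iid) in requested_keys
    ]
    return exact, near


requested = [
    "CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r1i1p2f1.SImon.sivol.gn.v20190314",
    "CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180221",
]
available = [
    "CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r3i1p2f1.SImon.sivol.gn.v20190215",
    "CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r1i1p2f1.SImon.sivol.gn.v20190314",
    "CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r6i1p1f1.SImon.siconc.gn.v20181119",
]
exact, near = split_matches(requested, available)
print(f"{len(exact)} exact, {len(near)} differ only in member/version")
```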

jbusecke added a commit that referenced this issue Oct 28, 2024
* Update iids_pr.yaml

Towards #180

* Update iids_pr.yaml

* Update iids_pr.yaml

* Update iids.yaml

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update iids_pr.yaml

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@jbusecke
Collaborator

Seeing a few errors for unavailable files (hopefully these resolve over time), but also a bunch of successful jobs already. I'll check in again in a bit and give you a report for now.

@jbusecke
Collaborator

jbusecke commented Oct 28, 2024

OK, I will need to change gears and work on something else for now, but let's continue here soon.

So I followed the instructions to check which datasets were uploaded here and got:

Found in catalog='qc': iids=['CMIP6.HighResMIP.NERC.HadGEM3-GC31-HH.hist-1950.r1i1p1f1.SImon.sivol.gn.v20210416', 'CMIP6.HighResMIP.CMCC.CMCC-CM2-HR4.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200917', 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1.hist-1950.r1i1p1f2.SImon.sivol.gn.v20190401', 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-HM.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180730', 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r1i1p2f1.SImon.siconc.gn.v20190314', 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P-HR.hist-1950.r1i1p2f1.SImon.sivol.gn.v20181212', 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-HM.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180730', 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-LL.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170921', 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170928', 'CMIP6.HighResMIP.CMCC.CMCC-CM2-VHR4.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200917', 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P-HR.hist-1950.r1i1p2f1.SImon.siconc.gn.v20181212', 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-LL.hist-1950.r1i1p1f1.SImon.sivol.gn.v20170921', 'CMIP6.HighResMIP.CMCC.CMCC-CM2-HR4.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200917', 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1.hist-1950.r1i1p1f2.SImon.siconc.gn.v20190401', 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1-HR.hist-1950.r1i1p1f2.SImon.siconc.gn.v20190221', 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1-HR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20190221', 'CMIP6.HighResMIP.NOAA-GFDL.GFDL-CM4C192.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180701', 'CMIP6.HighResMIP.NOAA-GFDL.GFDL-CM4C192.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180701']

Found in catalog='non-qc': iids=['CMIP6.HighResMIP.ECMWF.ECMWF-IFS-MR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20181119', 'CMIP6.HighResMIP.AWI.AWI-CM-1-1-LR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20170825', 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180606', 'CMIP6.HighResMIP.AWI.AWI-CM-1-1-HR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20170825', 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20170915', 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-XR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180606', 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-MR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20181119', 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-XR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180606', 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180221', 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180221', 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180606']

Found in catalog='retracted': iids=[]



Still missing 11 of 40: 
missing_iids=['CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200810', 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200810', 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r1i1p2f1.SImon.sivol.gn.v20190314', 'CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200921', 'CMIP6.HighResMIP.NERC.HadGEM3-GC31-HH.hist-1950.r1i1p1f1.SImon.siconc.gn.v20210416', 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i1p1f1.SImon.sivol.gn.v20170928', 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-LR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200812', 'CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200921', 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170915', 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-LR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200812', 'CMIP6.HighResMIP.CMCC.CMCC-CM2-VHR4.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200917']

Seems like we got 18 uploaded and tested! Quite a few fail our tests (the non-qc catalog). If you or the student have some time to look into what might be wrong with those datasets (follow the instructions here to access the non-qc datasets), that would be very helpful. Perhaps we can fix the issues. For the 11 that are still missing, I would recommend that we rerun the ingestion a couple of times and see if this is just due to flaky data nodes.

@mvichi
Author

mvichi commented Oct 30, 2024

Thank you, Julius, that was incredibly quick. I am travelling right now, and I'll be back at work next week. We'll report back on their status and quality asap.
We very much appreciate your prompt reaction!

@jbusecke
Collaborator

Running the pipeline once again just to see if we catch some more. Getting closer:

Still missing 4 of 40: 
missing_iids=['CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200810', 'CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200921', 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-LR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200812', 'CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200921']
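To see what each rerun actually recovered, the `missing_iids` lists from two consecutive runs can be compared with plain set operations. This is only a sketch: `rerun_progress` is a hypothetical helper, and the example uses a small subset of the iids reported in this thread.

```python
def rerun_progress(missing_before, missing_after):
    """Compare the missing lists of two runs.

    Returns (recovered, still_missing, newly_missing), each sorted.
    """
    before, after = set(missing_before), set(missing_after)
    recovered = sorted(before - after)      # ingested by the rerun
    still_missing = sorted(before & after)  # failed in both runs
    newly_missing = sorted(after - before)  # unexpected regressions
    return recovered, still_missing, newly_missing


missing_run1 = [
    "CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200810",
    "CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200921",
    "CMIP6.HighResMIP.ECMWF.ECMWF-IFS-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170915",
]
missing_run2 = [
    "CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200810",
    "CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200921",
]
recovered, still_missing, newly_missing = rerun_progress(missing_run1, missing_run2)
print(f"recovered={recovered}")
```

Datasets that stay in `still_missing` across several reruns are the ones more likely to have a real problem than a flaky data node.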

@jbusecke
Collaborator

jbusecke commented Nov 14, 2024

Running again to see if we can get the last holdouts to ingest. @mvichi did you get a chance to test the newly ingested data?

@mvichi
Author

mvichi commented Nov 18, 2024

We tested all the available data, and most of it works, thanks!
Some datasets fail the xmip preprocessing and some others crash for other reasons, but the data integrity seems good. Thank you very much again for adding the data so quickly.
We will make it available as a cookbook once completed. I'll share it through discourse, so that you can decide

@jbusecke
Collaborator

Awesome. If you could raise issues over at xMIP I can take a look at what is going on once some time frees up!
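For such issue reports it helps to know exactly which datasets fail and with what error. A minimal sketch of collecting that information: `collect_failures` is a hypothetical helper, and `fake_preprocess` and the plain dicts only stand in for real datasets and for whatever preprocessing function is used in practice (for example xMIP's `combined_preprocessing`).

```python
def collect_failures(datasets, preprocess):
    """Apply `preprocess` to every dataset and record which ones fail.

    Returns (results, failures): successfully processed datasets keyed by id,
    and a short error description for each dataset that raised.
    """
    results, failures = {}, {}
    for iid, ds in datasets.items():
        try:
            results[iid] = preprocess(ds)
        except Exception as err:  # deliberately broad: we want a report, not a crash
            failures[iid] = f"{type(err).__name__}: {err}"
    return results, failures


# Stand-in for a real preprocessing step (assumption for illustration only).
def fake_preprocess(ds):
    if "lat" not in ds:
        raise KeyError("lat")
    return ds


# Stand-ins for real datasets, keyed by a made-up id.
datasets = {"model-a": {"lat": [0.0]}, "model-b": {}}
results, failures = collect_failures(datasets, fake_preprocess)
for iid, message in failures.items():
    print(f"{iid}: {message}")
```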

@jbusecke
Collaborator

I am also still seeing 2 missing datasets on my end:

import intake

def zstore_to_iid(zstore: str) -> str:
    # Old and new stores use two different path layouts, so try one slice of
    # the path components first and fall back to the other if the result does
    # not look like a CMIP6 iid.
    parts = zstore.replace('gs://', '').replace('.zarr', '').replace('.', '/').split('/')
    iid = '.'.join(parts[-11:-1])
    if not iid.startswith('CMIP6'):
        iid = '.'.join(parts[-10:])
    return iid

def search_iids(col_url: str):
    # Open the catalog and keep only the iids that were requested.
    # Relies on the module-level list `iids_requested` defined below.
    col = intake.open_esm_datastore(col_url)
    iids_all = [zstore_to_iid(z) for z in col.df['zstore'].tolist()]
    return [iid for iid in iids_all if iid in iids_requested]


iids_requested = [
'CMIP6.HighResMIP.AWI.AWI-CM-1-1-HR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20170825',
 'CMIP6.HighResMIP.AWI.AWI-CM-1-1-LR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20170825',
 'CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200921',
 'CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200921',
 'CMIP6.HighResMIP.CMCC.CMCC-CM2-HR4.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200917',
 'CMIP6.HighResMIP.CMCC.CMCC-CM2-HR4.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200917',
 'CMIP6.HighResMIP.CMCC.CMCC-CM2-VHR4.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200917',
 'CMIP6.HighResMIP.CMCC.CMCC-CM2-VHR4.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200917',
 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1-HR.hist-1950.r1i1p1f2.SImon.siconc.gn.v20190221',
 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1-HR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20190221',
 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1.hist-1950.r1i1p1f2.SImon.siconc.gn.v20190401',
 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1.hist-1950.r1i1p1f2.SImon.sivol.gn.v20190401',
 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P-HR.hist-1950.r1i1p2f1.SImon.siconc.gn.v20181212',
 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P-HR.hist-1950.r1i1p2f1.SImon.sivol.gn.v20181212',
 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r1i1p2f1.SImon.siconc.gn.v20190314',
 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r1i1p2f1.SImon.sivol.gn.v20190314',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170915',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20170915',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180221',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180221',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-MR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20181119',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-MR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20181119',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-HM.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180730',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-HM.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180730',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-LL.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170921',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-LL.hist-1950.r1i1p1f1.SImon.sivol.gn.v20170921',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170928',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i1p1f1.SImon.sivol.gn.v20170928',
 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180606',
 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180606',
 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-XR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180606',
 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-XR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180606',
 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200810',
 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200810',
 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-LR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200812',
 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-LR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200812',
 'CMIP6.HighResMIP.NERC.HadGEM3-GC31-HH.hist-1950.r1i1p1f1.SImon.siconc.gn.v20210416',
 'CMIP6.HighResMIP.NERC.HadGEM3-GC31-HH.hist-1950.r1i1p1f1.SImon.sivol.gn.v20210416',
 'CMIP6.HighResMIP.NOAA-GFDL.GFDL-CM4C192.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180701',
 'CMIP6.HighResMIP.NOAA-GFDL.GFDL-CM4C192.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180701',
]

url_dict = {
    'qc':"https://storage.googleapis.com/cmip6/cmip6-pgf-ingestion-test/catalog/catalog.json",
    'non-qc':"https://storage.googleapis.com/cmip6/cmip6-pgf-ingestion-test/catalog/catalog_noqc.json",
    'retracted':"https://storage.googleapis.com/cmip6/cmip6-pgf-ingestion-test/catalog/catalog_retracted.json"
}

iids_found = []
for catalog,url in url_dict.items():
    iids = search_iids(url)
    iids_found.extend(iids)
    print(f"Found in {catalog=}: {iids=}\n")

missing_iids = list(set(iids_requested) - set(iids_found))
print(f"\n\nStill missing {len(missing_iids)} of {len(iids_requested)}: \n{missing_iids=}")
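To make the two branches of `zstore_to_iid` more concrete, here is a standalone sketch that runs a copy of the helper on two example store paths. Both paths are assumptions made up for illustration (a directory-style layout with a trailing slash, and a single `.zarr` store named after the iid); the real bucket layout may differ.

```python
def zstore_to_iid(zstore: str) -> str:
    """Copy of the helper above: recover the iid from a zarr store path."""
    parts = zstore.replace("gs://", "").replace(".zarr", "").replace(".", "/").split("/")
    iid = ".".join(parts[-11:-1])
    if not iid.startswith("CMIP6"):
        iid = ".".join(parts[-10:])
    return iid


expected = "CMIP6.HighResMIP.AWI.AWI-CM-1-1-HR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20170825"

# Hypothetical directory-style path: one folder per facet, trailing slash.
path_a = (
    "gs://example-bucket/CMIP6/HighResMIP/AWI/AWI-CM-1-1-HR/hist-1950/"
    "r1i1p1f2/SImon/sivol/gn/v20170825/"
)
# Hypothetical flat path: the store itself is named <iid>.zarr.
path_b = f"gs://example-bucket/some-prefix/run_1/{expected}.zarr"

iid_a = zstore_to_iid(path_a)
iid_b = zstore_to_iid(path_b)
print(iid_a == expected, iid_b == expected)
```

In the first case the trailing slash leaves an empty last component, which is why the slice `[-11:-1]` is used; in the second case that slice picks up a path prefix instead, so the fallback `[-10:]` applies.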

So I'll leave this open for now, unless you think we can close it.

In any case, please make sure to cite the original CMIP6 data sources, and if you could acknowledge our efforts here (https://zenodo.org/badge/latestdoi/618127503) too, that would help a lot. Cheers.
