Proposed Recipes for the Last Millennium Reanalysis, v2.x #142
Comments
@cisaacstern Ok! I think I've got another one in the works! One question that came up however is whether it would be best to move these data from their current residence on the NOAA FTP server to THREDDS, and whether that will introduce any new subtleties I should be aware of. When I ran this locally (with the FTP urls), it took about two hours.
@cisaacstern True to form, I think I might have a way to tackle the un-gridded variables, but that will have to wait for tomorrow :)

Up to you! FWIW, I don't think 2 hrs to cache data is necessarily that long. My intuition is that waiting 2 hrs to cache the data (which only has to happen once) is a smaller price to pay than moving things around on the NOAA side, but I don't know how easy it may (or may not) be to move to THREDDS.
Source Dataset
The Last Millennium Reanalysis (LMR) utilizes an ensemble methodology to assimilate paleoclimate data for the production of annually resolved climate field reconstructions of the Common Era. The data are available at NOAA but not (as far as we know) enabled for OPeNDAP access, much less cloud access. The PaleoCube project would like to make them available to paleoclimatologists to support several workflows in the cloud.
Gridded fields (sea-level pressure, surface air temperature, sea-surface temperature, precipitation, Palmer Drought Severity Index) have the format (time, MCrun, lat, lon), where time is the year, lat is the latitude index, lon is the longitude index, and MCrun is the Monte Carlo iteration index. There are in fact 20 LMR reconstructions contained in these arrays. They differ in the climate-model ensemble prior used for assimilation (random draws from the CCSM4 Last Millennium simulation) and in the proxies drawn randomly for the reconstruction (75% of all available proxies). All fields are anomalies from the 1951–1980 time mean.
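As a rough illustration, a single gridded file can be inspected with xarray. The file and variable names below are assumptions patterned on the 20CR-style naming conventions noted next, not confirmed paths:

```python
import xarray as xr

# Hypothetical file/variable names, patterned on the NOAA 20CR conventions.
ds = xr.open_dataset("air_MCruns_ensemble_mean_LMRv2.1.nc")

# Expected dimensions: (time, MCrun, lat, lon), where MCrun indexes
# the 20 Monte Carlo reconstruction iterations.
print(ds.dims)

# Fields are anomalies from the 1951-1980 time mean, so the average
# over that window should be close to zero.
print(ds["air"].sel(time=slice("1951", "1980")).mean("time"))
```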
File and variable naming conventions follow as closely as possible those for the NOAA 20th Century Reanalysis.
In addition, there are files with full (5000-member) ensembles for global mean surface temperature, northern and southern hemisphere temperature, and various climate indices (e.g. AMO, PDO, AO, NAO, NINO3, SOI).
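For instance, the full-ensemble index files could be summarized along these lines; the file name, variable name, and ensemble dimension name are all hypothetical, shown only to illustrate the shape of the data:

```python
import xarray as xr

# Hypothetical names for the GMST full-ensemble file, its variable,
# and its ensemble dimension.
gmst = xr.open_dataset("gmt_MCruns_ensemble_full_LMRv2.1.nc")["gmt"]

# Collapse the 5000-member ensemble to a median and a 95% credible interval.
median = gmst.median(dim="members")
bounds = gmst.quantile([0.025, 0.975], dim="members")
```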
Data from two versions (2.0 and 2.1) are provided, both described in Tardif et al. (2019). The two versions share a common methodology; the differences are related to the set of assimilated proxies.
Link to the website / online documentation for the data: https://www.ncei.noaa.gov/access/paleo-search/study/27850
The file format is netCDF
The source files are organized as follows: for each gridded field, 4 files (v2.0 and v2.1, each with the ensemble mean and the ensemble spread); for the indices, 8 files (4 files per LMR "flavor": GMST, NHMT, SHMT, and posterior indices).
How are the source files accessed: access protocol unknown (the discussion above used the NOAA FTP URLs), but netCDF files are available here: v2.0 files, v2.1 files. A sketch of a possible recipe follows this list.
Data are public, fully open.
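A minimal sketch of the recipe using pangeo-forge-recipes might look as follows. The URL template and variable keys are placeholders, not the actual NOAA paths; since each source file carries the full time series for one field, the files would be merged along a variable dimension rather than concatenated in time:

```python
from pangeo_forge_recipes.patterns import FilePattern, MergeDim
from pangeo_forge_recipes.recipes import XarrayZarrRecipe

# Hypothetical URL template -- substitute the real NOAA file locations.
def make_url(variable):
    return (
        "ftp://ftp.ncei.noaa.gov/path/to/"
        f"{variable}_MCruns_ensemble_mean_LMRv2.1.nc"
    )

# Placeholder variable keys; one source file per gridded field.
pattern = FilePattern(make_url, MergeDim("variable", ["air", "prate", "prmsl"]))

recipe = XarrayZarrRecipe(pattern, target_chunks={"time": 100})
```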
Transformation / Alignment / Merging
No transformation beyond loading into zarr. The .nc files can easily be loaded by xarray, so this step should not pose particular problems.
Output Dataset
zarr format, preferably parked in GCP US-central so it is easily accessible from 2i2c's LinkedEarth research hub.
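Once the store is built, it could be opened directly from the hub; the bucket path below is a placeholder:

```python
import xarray as xr

# Placeholder bucket path; reading "gs://" URLs requires gcsfs.
ds = xr.open_zarr("gs://some-bucket/lmr/v2.1.zarr")
```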