How best to support Ensembles of data #172
CoverageJSON can support an arbitrary number of axes, including ensemble. However, the defined domain types do specify which axes they support. So if he’s trying to use a Grid domain type (which is defined to have x and y and maybe z and t) with a custom axis, then yes, the validation will fail. Options within the current spec are:
The above basically mirrors what usually happens in the CF-NetCDF world (i.e. there is no established convention for an ensemble axis and most software won’t recognise one). The other aspect is that ensemble datasets are likely to get very large and JSON documents aren’t designed for large bulk transfers. Datasets could be split into tiles, but this is more complicated. Could it be easier just to have each ensemble member as a separate coverage?
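For illustration, the "one coverage per ensemble member" idea could look roughly like the sketch below. It assumes that each member's coverage points at a shared domain document by URL, so the domain is written only once. The base URL, the path layout and the `member_coverages` helper are hypothetical, and `parameters` and `referencing` are left out for brevity, so the output is not a complete, valid document.

```python
import json


def member_coverages(base_url, members, variables):
    """Build one Coverage per ensemble member, all pointing at one shared domain."""
    coverages = []
    for member in members:
        coverages.append({
            "type": "Coverage",
            # Hypothetical identifier scheme for an ensemble member.
            "id": f"{base_url}/members/{member}",
            # Shared domain document, referenced by URL (hypothetical path).
            "domain": f"{base_url}/domain.covjson",
            # One range document per variable of this member (hypothetical paths).
            "ranges": {
                var: f"{base_url}/members/{member}/{var}.covjson"
                for var in variables
            },
        })
    return {
        "type": "CoverageCollection",
        "domainType": "Grid",
        "coverages": coverages,
    }


collection = member_coverages("https://example.org/forecast", [0, 1, 2], ["temperature"])
print(json.dumps(collection, indent=2))
```

The redundancy mentioned in the thread shows up here as the repeated `domain` reference and, in a full document, the repeated per-member metadata.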
@jonblower We could discuss this online tomorrow (@m-burgoyne and I will not be in the office). But we are keen to future-proof for a variety of use cases, and therefore to support multiple dimensions. We also envisage pervasive ensemble usage, though only of "small" subsets via OGC API-EDR and its various Parts (Core, Pub-Sub, aggregated stats, etc). We may need to break with the traditional CF-NetCDF model at some stage to benefit from (geo)Zarr, etc.
The above is just me thinking aloud.
Just to note that the purpose of the "domain types" is to define a practical set of restricted profiles of CoverageJSON so that clients don't have to deal with a large multiplicity of approaches. And yes, the point of using tiles is that the individual tiles are persistent and cacheable. My philosophical point is that I have always been wary of treating "ensemble" as a dimension, with the same status as a spatiotemporal dimension. I know it's tempting to do so in the context of hypercubes, but ensembles (unlike S-T dimensions) don't have any defined ordering. I have always preferred to model them essentially as separate coverages, although this does introduce some redundancy of metadata. Another option, which I've only just thought of, is to somehow deal with this in the Range. Maybe this isn't very neat, but if the ensemble members all share the same domain, then we could have (n x m) Range documents, where n is the number of ensemble members and m is the number of variables in each member. Would need more thought (and coffee!)
@jonblower An advantage of your third approach, though it may be prone to being voluminous and to creeping out of the original scope of CoverageJSON, is that it is more akin to the underlying theoretical scientific approach, i.e. an ensemble is an approximation to a probability distribution function (pdf) of a variable. It may be more amenable or elegant for retrieving derived statistics such as a percentile or a threshold value.
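As a rough illustration of treating the members as samples from a pdf, the sketch below derives percentiles and a threshold-exceedance fraction from the values that each member gives at one grid point. The `ensemble_stats` helper and the sample values are made up for the example; only the Python standard library is used.

```python
import statistics


def ensemble_stats(member_values, threshold):
    """Summarise one grid point's ensemble as percentiles and an exceedance fraction."""
    # Nine cut points splitting the sample into ten equal-probability groups.
    deciles = statistics.quantiles(member_values, n=10, method="inclusive")
    # Fraction of members above the threshold, a simple probability estimate.
    exceed = sum(v > threshold for v in member_values) / len(member_values)
    return {
        "p10": deciles[0],
        "p50": deciles[4],
        "p90": deciles[8],
        "p_exceed": exceed,
    }


# Made-up values from five ensemble members at a single grid point.
stats = ensemble_stats([1.0, 2.0, 3.0, 4.0, 5.0], threshold=3.5)
print(stats)
```

Note that nothing here depends on the order of the members, which fits the point above that ensembles have no defined ordering.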
I think any approach will be voluminous, but the appealing things about dealing with this in the Range are: (1) we only need to define the Domain once, (2) the high volume can easily be managed by splitting the Range among separate documents, so the client only needs to get the members they want. If I get some time I'll see if I can work up a proposal - there could be a fatal flaw I haven't spotted!
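A minimal sketch of the (n x m) Range idea follows, under the assumption that each (member, variable) pair gets its own range document at a predictable URL. This layout is not defined by the CoverageJSON spec; the URL pattern and the `range_index` and `urls_for` helpers are hypothetical and only show how a client might fetch just the members it wants.

```python
def range_index(base_url, n_members, variables):
    """Map every (member, variable) pair to its own range document URL."""
    return {
        (member, var): f"{base_url}/ranges/member{member}/{var}.covjson"
        for member in range(n_members)
        for var in variables
    }


def urls_for(index, wanted_members):
    """Select only the range documents that belong to the requested members."""
    return [url for (member, _), url in index.items() if member in wanted_members]


# n = 3 members and m = 2 variables give n x m = 6 range documents, one shared domain.
index = range_index("https://example.org/forecast", 3, ["temperature", "pressure"])
subset = urls_for(index, {0, 2})
print(len(index), len(subset))
```

The domain would be fetched once, and only the selected range documents afterwards, which is where the volume saving would come from.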
@chris-little Since I don't think this is a schema problem, but a question of how CovJSON can best support ensembles, would it be worth renaming this ticket, or closing it and opening another one? Did you and @m-burgoyne discuss a preferred approach?
@jonblower As suggested, I have re-titled this issue, and created a separate one (Issue #184) for more than 4 dimensions.
@m-burgoyne tried to create CoverageJSON for some NWP ensemble forecast data using a fifth, custom dimension, but it failed the schema check. @jonblower @letmaik Could the schema be enhanced to support more dimensions and remain backward compatible?