Add generic aggregate process #526
base: draft
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,135 @@ | ||||||
{ | ||||||
"id": "aggregate", | ||||||
"summary": "Aggregation based on general intervals", | ||||||
"description": "Computes an aggregation based on an array of intervals.\n\nThe computed values will be projected to the labels. If no labels are specified, the lower value of the interval will be used as label for the corresponding values. In case of a conflict (i.e. the user-specified values for the lower values of the intervals are not distinct), the user-defined labels must be specified in the parameter `labels` as otherwise a `DistinctDimensionLabelsRequired` exception would be thrown. The number of user-defined labels and the number of intervals need to be equal.", | ||||||
"categories": [ | ||||||
"cubes", | ||||||
"aggregate" | ||||||
], | ||||||
"experimental": true, | ||||||
"parameters": [ | ||||||
{ | ||||||
"name": "data", | ||||||
"description": "A data cube.", | ||||||
"schema": { | ||||||
"type": "object", | ||||||
"subtype": "datacube" | ||||||
} | ||||||
}, | ||||||
{ | ||||||
"name": "intervals", | ||||||
"description": "Left-closed intervals, which are allowed to overlap. Each interval in the array has exactly two elements:\n\n1. The first element is the lower value of the interval. The specified value is **included** in the interval.\n2. The second element is the upper value of the temporal interval. The specified value is **excluded** from the interval.\n\nThe second element must always be greater than the first element. Otherwise, an `ExtentEmpty` exception is thrown.", | ||||||
when working with a spatial dimension (e.g. "x" or "y"): how is a user supposed to know what CRS to use to define the intervals? I don't think this is explicitly available or defined on a datacube.
I guess that "temporal" is not intentional there
Isn't it? The datacube extension defines the CRS and then that defines the unit and extents etc.
Yes, but nothing that is relevant to figuring out the actual labels ( Also, this might be backend-dependent, so that would undermine the reproducibility of the process graph |
||||||
"schema": { | ||||||
"type": "array", | ||||||
"minItems": 1, | ||||||
"items": { | ||||||
"type": "array", | ||||||
"uniqueItems": true, | ||||||
"minItems": 2, | ||||||
"maxItems": 2, | ||||||
"items": { | ||||||
"type": "number" | ||||||
So intervals are only based on numbers? That implies that this process only works along a spatial dimension in practice?
The order of string values is not really well-defined. As such, yes, only numerical. Dimensions can have numerical values though; we have a couple of processes that just index the labels if no further information is given (e.g. apply_dimension). |
||||||
} | ||||||
} | ||||||
} | ||||||
}, | ||||||
{ | ||||||
"name": "reducer", | ||||||
"description": "A reducer to be applied to the values contained in each interval. A reducer is a single process such as `mean()` or a set of processes, which computes a single value for a list of values, see the category 'reducer' for such processes. An interval may contain no values, which for most reducers leads to no-data (`null`) values by default.", | ||||||
"schema": { | ||||||
"type": "object", | ||||||
"subtype": "process-graph", | ||||||
"parameters": [ | ||||||
{ | ||||||
"name": "data", | ||||||
"description": "A labeled array with elements of any type. If there's no data for the interval, the array is empty.", | ||||||
"schema": { | ||||||
"type": "array", | ||||||
"subtype": "labeled-array", | ||||||
"items": { | ||||||
"description": "Any data type." | ||||||
} | ||||||
} | ||||||
}, | ||||||
{ | ||||||
"name": "context", | ||||||
"description": "Additional data passed by the user.", | ||||||
"schema": { | ||||||
"description": "Any data type." | ||||||
}, | ||||||
"optional": true, | ||||||
"default": null | ||||||
} | ||||||
], | ||||||
"returns": { | ||||||
"description": "The value to be set in the new data cube.", | ||||||
"schema": { | ||||||
"description": "Any data type." | ||||||
} | ||||||
} | ||||||
} | ||||||
}, | ||||||
{ | ||||||
"name": "dimension", | ||||||
"description": "The name of the dimension for aggregation. All data along the dimension is passed through the specified reducer. Fails with a `DimensionNotAvailable` exception if the specified dimension does not exist.", | ||||||
"schema": { | ||||||
"type": [ | ||||||
"string", | ||||||
"null" | ||||||
] | ||||||
} | ||||||
}, | ||||||
{ | ||||||
"name": "labels", | ||||||
"description": "Distinct labels for the intervals. Only required if the lower values of the intervals are not distinct, in which case the default labels would not be unique. The number of labels and the number of intervals must be equal.", | ||||||
"schema": { | ||||||
"type": "array", | ||||||
"uniqueItems": true, | ||||||
"items": { | ||||||
"type": "number" | ||||||
} | ||||||
}, | ||||||
"default": [], | ||||||
"optional": true | ||||||
}, | ||||||
{ | ||||||
"name": "context", | ||||||
"description": "Additional data to be passed to the reducer.", | ||||||
"schema": { | ||||||
"description": "Any data type." | ||||||
}, | ||||||
"optional": true, | ||||||
"default": null | ||||||
} | ||||||
], | ||||||
"returns": { | ||||||
"description": "A new data cube with the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged, except for the resolution and dimension labels of the given temporal dimension.", | ||||||
is it intentional to have "temporal" there, or should this be generic for all types of dimensions?
No, just copy-pasted things poorly from aggregate_temporal ;-) |
||||||
"schema": { | ||||||
"type": "object", | ||||||
"subtype": "datacube", | ||||||
"dimensions": [ | ||||||
{ | ||||||
"type": "temporal" | ||||||
} | ||||||
] | ||||||
} | ||||||
}, | ||||||
"exceptions": { | ||||||
"DimensionNotAvailable": { | ||||||
"message": "A dimension with the specified name does not exist." | ||||||
}, | ||||||
"DistinctDimensionLabelsRequired": { | ||||||
"message": "The dimension labels have duplicate values. Distinct labels must be specified." | ||||||
}, | ||||||
"ExtentEmpty": { | ||||||
"message": "At least one of the intervals is empty. The second element must always be greater than the first element." | ||||||
} | ||||||
}, | ||||||
"links": [ | ||||||
{ | ||||||
"href": "https://openeo.org/documentation/1.0/datacubes.html#aggregate", | ||||||
"rel": "about", | ||||||
"title": "Aggregation explained in the openEO documentation" | ||||||
} | ||||||
] | ||||||
} |
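For orientation, a call to the proposed process might be assembled roughly as follows. This is only a sketch: the helper `build_aggregate_node`, the node ids `load1` and `mean1`, and the dimension name `x` are hypothetical, and the encoding assumes the usual openEO process graph conventions (`from_node`, `from_parameter`). The `aggregate` process itself is experimental and only proposed in this PR.

```python
import json


def build_aggregate_node(data_node_id, intervals, dimension, labels=None):
    """Build a process graph node that calls the proposed `aggregate` process.

    Hypothetical helper for illustration; not part of any openEO client.
    """
    arguments = {
        "data": {"from_node": data_node_id},
        "intervals": intervals,
        # Child process graph: reduce the values of each interval with mean().
        "reducer": {
            "process_graph": {
                "mean1": {
                    "process_id": "mean",
                    "arguments": {"data": {"from_parameter": "data"}},
                    "result": True,
                }
            }
        },
        "dimension": dimension,
    }
    # `labels` is optional; it is only needed if the lower values collide.
    if labels is not None:
        arguments["labels"] = labels
    return {"process_id": "aggregate", "arguments": arguments}


# Two adjacent left-closed intervals [0, 10) and [10, 20) along "x".
node = build_aggregate_node("load1", [[0, 10], [10, 20]], "x")
print(json.dumps(node, indent=2))
```

Since no labels are passed here, the result would be labeled with the lower values 0 and 10 according to the description above.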
what does "projecting" to a label mean?
"mapped" is probably better than "projected". Maybe it can also just be removed.
(I should also check whether this wording also exists in aggregate_temporal...)
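To make the "mapped to labels" wording concrete, here is a minimal one-dimensional sketch of how I read the definition: each interval is left-closed, the reducer result is stored under the interval's label, and the lower value serves as the default label. The function `aggregate_1d`, the helper `mean_or_none`, and the flat `values`/`coords` lists are assumptions for illustration, not a backend implementation. The definition names no exception for a label count mismatch, so a plain `ValueError` is assumed there.

```python
from statistics import mean


class ExtentEmpty(ValueError):
    """An interval's upper value is not greater than its lower value."""


class DistinctDimensionLabelsRequired(ValueError):
    """The resulting dimension labels would not be unique."""


def mean_or_none(values):
    # Assumption: an empty interval yields no-data (None), as most reducers would.
    return mean(values) if values else None


def aggregate_1d(values, coords, intervals, reducer, labels=None):
    """Aggregate `values` along one numeric dimension.

    `coords[i]` is the numeric dimension label of `values[i]`.
    Returns a dict that maps each interval label to the reduced value.
    """
    for lower, upper in intervals:
        if not upper > lower:
            raise ExtentEmpty(f"Interval [{lower}, {upper}) is empty.")

    if not labels:
        # Default: the lower value of each interval becomes its label.
        labels = [lower for lower, _ in intervals]
    elif len(labels) != len(intervals):
        # Assumed error type; the definition does not name one.
        raise ValueError("Number of labels and intervals must be equal.")

    if len(set(labels)) != len(labels):
        raise DistinctDimensionLabelsRequired("Distinct labels must be specified.")

    result = {}
    for label, (lower, upper) in zip(labels, intervals):
        # Left-closed: lower is included, upper is excluded.
        selected = [v for v, c in zip(values, coords) if lower <= c < upper]
        result[label] = reducer(selected)
    return result


coords = [0, 1, 2, 3, 4, 5]
values = [10, 20, 30, 40, 50, 60]
# Overlapping intervals are allowed; the last interval contains no values.
print(aggregate_1d(values, coords, [[0, 3], [2, 5], [10, 12]], mean_or_none))
```

Two intervals sharing a lower value, e.g. `[[0, 3], [0, 5]]`, raise `DistinctDimensionLabelsRequired` in this sketch unless explicit `labels` such as `[1, 2]` are passed.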