Add STAC creation and Xarray loading functionality #266
base: develop
Thanks for this @forrestfwilliams! We were able to run the demo notebook without any trouble and the new functions greatly facilitate the hyp3->mintpy connection. Also wanted to link to this issue with some of the original discussion and links to prototypes ASFHyP3/hyp3-isce2#170.
src/hyp3_sdk/stac.py
I'd suggest aligning with this extension as much as possible: https://github.com/stac-extensions/insar. Some of these extensions are not SAR-specific (sat:orbit_state, for example), but they are used out in the wild by many commercial data providers (Planet, Maxar, Capella, Umbra, etc.), and the more standardization around common names the better from a user perspective! Some specific recommendations below:
- sar:looks_range, sar:looks_azimuth, sar:observation_direction
- sat:orbit_state, sat:relative_orbit (e.g. instead of reference_orbit_direction, secondary_orbit_number)
- view:azimuth, view:incidence_angle
- processing:lineage, processing:software
Thanks for the feedback @scottyhq, I'll look at incorporating these fields!
I put up a test catalog here based on burst products, since rendering with the canonical STAC Browser provides a nice test (https://radiantearth.github.io/stac-browser/#/external/raw.githubusercontent.com/relativeorbit/three-sisters/main/115_245676_IW2/stac/collection.json?.language=en). Metadata is looking good! Would be great to be able to render the tiffs on the map; currently looking at why they don't...
Wow, that looks even better than I would have expected! I've added all the metadata fields you mentioned that I can for now. For some of the fields you mentioned we don't provide the needed metadata in our HyP3 products (i.e. software version info for processing:software).
Hey forrest, just revisiting this :) I think there are options for processing:software that would be great to track in this metadata. So HyP3 ISCE2 v1.0.0 or HyP3 GAMMA v8.1.2? I could also see usefulness in tracking the underlying software version isce2-2.6.3, or I suppose pointing to the Docker image that HyP3 runs, ghcr.io/asfhyp3/hyp3-isce2:1.0.0
@forrestfwilliams @scottyhq we do capture the processing software information in all our products, but not in the most machine-readable way. Typically, with a sentence like (jinja2 templated):
This data was processed by ASF DAAC HyP3 {{ processing_date.year }} using the {{ plugin_name }} plugin version
{{ plugin_version }} running {{ processor_name }} release {{ processor_version }}.
The plugin name + plugin version directly corresponds to the plugin container used to create the product, so we could easily add that as well.
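One way the templated sentence above could be made machine-readable is the processing:software field, which the STAC processing extension defines as a mapping of software name to version string. The following is a minimal sketch, assuming the same four template variables are available; the function name is hypothetical, and the example values are borrowed from the version strings mentioned earlier in this thread rather than from real product metadata.

```python
def processing_software(plugin_name, plugin_version, processor_name, processor_version):
    """Build a `processing:software` mapping (software name -> version string)."""
    return {
        plugin_name: str(plugin_version),
        processor_name: str(processor_version),
    }


# Illustrative values only; not read from an actual HyP3 product.
properties = {
    'processing:software': processing_software('hyp3-isce2', '1.0.0', 'isce2', '2.6.3'),
}
print(properties)
```

Keeping the plugin and the underlying processor as separate keys preserves both pieces of information that the sentence currently carries.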
I think adding software version is a good idea, but we should add this info to our base metadata txt file before we add it to the STAC items. The STAC implementation becomes much simpler once we have this in place.
Unfortunately it looks like a month ago I added some comments that maybe never appeared since they are "pending" until I hit the submit review button 🤦. I'm hitting 'approve' since I'm in favor of adding this and tested it out for hyp3-isce2 bursts, but obviously I'm guessing it'll need another ASF reviewer!
'sar:looks_range': extra_properties['hyp3:range_looks'],
'sat:orbit_state': extra_properties['hyp3:reference_orbit_direction'].lower(),
'sat:absolute_orbit': extra_properties['hyp3:reference_orbit_number'],
'view:azimuth': (360 + extra_properties['hyp3:heading']) % 360,  # change of convention
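For readers who want to try the mapping in isolation, the four lines above can be wrapped in a small standalone function. This is only a sketch: the function name and the input values are made up for illustration, and only the key names come from the diff. The modulo step wraps a negative heading into the [0, 360) range.

```python
def to_stac_properties(extra_properties):
    """Translate HyP3-style keys into STAC extension field names."""
    heading = extra_properties['hyp3:heading']
    return {
        'sar:looks_range': extra_properties['hyp3:range_looks'],
        'sat:orbit_state': extra_properties['hyp3:reference_orbit_direction'].lower(),
        'sat:absolute_orbit': extra_properties['hyp3:reference_orbit_number'],
        # Wrap a possibly negative heading into [0, 360).
        'view:azimuth': (360 + heading) % 360,
    }


# Hypothetical input values, chosen only to exercise the conversion.
example = {
    'hyp3:range_looks': 20,
    'hyp3:reference_orbit_direction': 'ASCENDING',
    'hyp3:reference_orbit_number': 12345,
    'hyp3:heading': -12.5,
}
stac_props = to_stac_properties(example)
print(stac_props)
```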
I know there are lots of 'heading/azimuth' and incidence conventions out there for SAR LOS conversions, but I suggest also adding view:incidence_angle here (https://github.com/stac-extensions/view#item-properties).
Incidence angle information is also something we don't currently report in our txt metadata files. We'll need to add this field on the plugin side before we can implement view:incidence_angle.
@scottyhq I'm on the hook to review this! I'll try to get to it next week. My high-level comment right now, which @forrestfwilliams and I have been kicking around, is that the actual STAC item JSON should be created by the plugin, not here in the SDK, since the plugins have all the information, context, and, importantly, dependencies to create the item. That would allow us, for example, to add STAC endpoints to HyP3, which I've prototyped here (not following the STAC spec yet though): @forrestfwilliams would prefer to "just get this out" (him) instead of "doing it right" (me) so we can help users now, as getting the work to add it to the plugins prioritized and scheduled in the team backlog is likely to be a slow process and not under either of our control. @scottyhq, what do you think? I'm def. interested to hear which approach you'd prefer. I plan on doing an in-depth review early next week, and I'd expect @forrestfwilliams and I will settle on a path forward then.
properties.update(extra_properties)
item = pystac.Item(
    id=base_url.split('/')[-1].replace('.zip', ''),
    geometry=geo_info.bbox_geojson,
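Since the diff excerpt is truncated, the two arguments it shows can be illustrated with the standard library alone. This is a sketch under assumptions: the URL is invented, and `bbox_to_geojson` is a hypothetical stand-in for whatever `geo_info.bbox_geojson` computes in the PR.

```python
def item_id_from_url(base_url):
    """Use the final path component, minus the .zip suffix, as the item id."""
    return base_url.split('/')[-1].replace('.zip', '')


def bbox_to_geojson(west, south, east, north):
    """Return a GeoJSON Polygon tracing the bounding box as a closed ring."""
    ring = [[west, south], [east, south], [east, north], [west, north], [west, south]]
    return {'type': 'Polygon', 'coordinates': [ring]}


url = 'https://example.com/products/EXAMPLE_PRODUCT.zip'  # hypothetical URL
item_id = item_id_from_url(url)
geometry = bbox_to_geojson(-122.0, 44.0, -121.5, 44.5)
print(item_id, geometry)
```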
Preferably geometry is the valid data footprint rather than the bbox. Could bring in this dependency for convenience: https://stactools.readthedocs.io/en/stable/footprint.html
I see your point. The current geojson demarcates the actual extent of raster products, including the nodata areas. Unfortunately, obtaining the rotated footprint that contains valid data will likely take work on the plugin and will need to wait for future iterations.
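To make the bbox-versus-footprint distinction concrete, here is a rough, dependency-free sketch that takes the convex hull of the valid pixel centers in a tiny raster. It is neither what stactools does nor what this PR implements; the grid, nodata value, georeferencing, and function names are all illustrative assumptions, and a convex hull only approximates a true valid-data footprint.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def valid_data_hull(grid, nodata, origin_x, origin_y, pixel_size):
    """Convex hull of the centers of all non-nodata pixels in a north-up grid."""
    centers = [
        (origin_x + (col + 0.5) * pixel_size, origin_y - (row + 0.5) * pixel_size)
        for row, values in enumerate(grid)
        for col, value in enumerate(values)
        if value != nodata
    ]
    return convex_hull(centers)


# A 4x4 raster whose corners are nodata (0); the raster extent is still the full square.
grid = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
hull = valid_data_hull(grid, nodata=0, origin_x=0.0, origin_y=4.0, pixel_size=1.0)
# Close the ring to form a GeoJSON Polygon.
footprint = {'type': 'Polygon', 'coordinates': [[list(p) for p in hull + hull[:1]]]}
print(footprint)
```

The bounding box of this raster has four corners, while the hull excludes the nodata corners, which is the difference the review comment is pointing at.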
Agreed it makes sense for the plugins to generate an The discussion and iteration on this branch to hone the metadata is definitely useful in the meantime! And I'm glad that people can install from this branch and take the SDK approach if sufficiently motivated :)
Co-authored-by: Scott Henderson <[email protected]>
OK @jhkennedy this is ready for your review. Notably, I've punted on adding some of @scottyhq's requested features:
All of these will be simpler with upstream changes to the plugins/moving the STAC item creation to the plugins. As a first step though, I think it still makes sense to add the STAC functionality to the SDK, then migrate it to the plugins in the future. This allows us to get this functionality to our users more quickly, and requires us to work in fewer repositories while we're still nailing down the basics.
Adapted from and heavily inspired by @scottyhq's 2023 AGU presentation, this PR adds the ability to create STAC items and collections from sets of completed HyP3 jobs. While we do provide unzipped copies of HyP3 products, we have never publicized this well because there has not been an efficient method to retrieve them. The STAC ecosystem provides an elegant solution to this problem that the community is already familiar with.
In addition, it provides utilities for turning these STAC collections into Xarray datastacks using odc-stac, and for turning these Xarray objects into MintPy-compatible HDF5 files. Using this new functionality to prep HyP3 data for MintPy is a significant improvement over our current preparation guidelines, since it removes the need to download, unzip, and crop each product individually. To demo the new MintPy workflow enabled by these changes, check out this modified version of our HyP3 MintPy notebook.
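As a rough idea of what the STAC-to-Xarray step can look like with pystac and odc-stac directly, consider the sketch below. It does not use the utilities added in this PR, whose names are not shown in this conversation; `load_stack` and its parameters are hypothetical, and depending on the item metadata odc-stac may also need an explicit `crs` and `resolution`.

```python
def load_stack(collection_path, chunks=None):
    """Open a STAC collection file and lazily load its items as an xarray.Dataset."""
    # Imported lazily so this sketch can be imported without the packages installed.
    import odc.stac
    import pystac

    if chunks is None:
        chunks = {'x': 2048, 'y': 2048}  # assumed chunk size; tune for your data

    collection = pystac.Collection.from_file(collection_path)
    items = list(collection.get_items())
    # Passing `chunks` asks odc-stac for Dask-backed arrays instead of eager reads.
    return odc.stac.load(items, chunks=chunks)
```

Calling `load_stack('collection.json')` on a collection written to disk would return a lazily loaded dataset, so pixels are only read when a computation or subset is requested.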