-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Go back to rechunker Pipeline executors #192
Conversation
def _to_pipelines(self) -> ParallelPipelines: | ||
"""Translate recipe to pipeline for execution. | ||
""" | ||
pipeline = [] # type: MultiStagePipeline | ||
pipeline.append(Stage(self.store_chunk, list(self.iter_chunks()))) | ||
pipeline.append(Stage(self.finalize_target)) | ||
pipelines = [pipeline] # type: ParallelPipelines | ||
return pipelines |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here note that the compilation of the flow to pipelines is now implemented by each recipe class, not BaseRecipe. This gives recipes a lot more flexibility.
If this branch works, it will help things a lot going forward. We need an answer. I would like to find a way to test this at scale with the workflows that were previously running out of memory. Questions for @sharkinsspatial:
|
@rabernat I can have a go at this. I already have a rudimentary version of pangeo-forge/pangeo-forge-prefect#25 which I was using for testing purposes which will handle this. Do you have a specific recipe which would be the best candidate for testing? |
Whichever recipe was causing things to crash before. I guess the noaa-oisst one? |
@rabernat I attempted to run As another test, I attempted to run https://github.com/pangeo-forge/gpm-imerge-hhr-feedstock. Running against For next steps,
Let me know if there might be any other considerations here. |
@sharkinsspatial, if the error is |
Do you have DEBUG logs from the failed run we can log at? Just as a general point, exposing these logs for public view seems like a really important priority for the bakeries. |
@rabernat, at our last Coordination Meeting, there was discussion of routing these logs to a "third party logging service" to get them out from behind the Prefect credential wall, and make them more easily searchable. Does that still seem like a viable option and if so, is there a particular service or implementation from another project, which we could use as a model? |
Based on this post, one free path would be
|
@cisaacstern Your assumption was correct that this is authentication issue. See the logs below
To re-iterate, this authentication issue does not occur with |
0.5.0 is over a month old. We are already on 0.6.0 (see release notes). It seems most likely that this regression does not have to do with this PR (which doesn't touch the related code) but rather is due to #167 (refactor of input authentication and secret handling). Automatic building of bakery images (pangeo-forge/pangeo-forge-bakery-images#7), coupled with bakery integration testing, might have helped catch this issue earlier. How / where are the secrets getting passed to the flow? |
As discussed at our last dev meeting, we may want to try going back to using the rechunker Pipeline executors internally.
This PR tries that, plus moves towards a more flexible execution model. Rather than having to define the four specific stages, Recipes just have to define a private method
_to_pipelines()
. That way they can customize their own stages.This seems to work (tests pass). The big question is whether the Dask and Prefect graphs are still as efficient as we need them to be. My hope is yes, because the underlying functions in the pipelines are still based on the more efficient
partial
functions that @TomAugspurger defined in #160.