Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vine: Improve how errors are propagated with FuturesExecutor #3922

Merged
merged 6 commits into from
Sep 16, 2024

Conversation

gpauloski
Copy link
Contributor

Proposed Changes

Improves how task errors and timeouts are propagated in Future, wait(), and as_completed() when using the FuturesExecutor (see #3920). The commits are in chronological order, each addressing a single class/function, if you prefer to review it that way.

A couple notes:

  • In addition to fixing how TimeoutErrors are raised in as_completed, I changed it to return a generator which will reduce latency on the consumer side compared to waiting to construct an iterator from a fully constructed list of futures.
  • FutureFunctionCall.output()/FuturePythonTask.output() were inconsistent in when they raised vs returned task exceptions. I changed those methods to always return task exceptions, and then left the responsibility of raising exceptions to the Future.result().
  • FutureFunctionCall.output() seems to only have access to the string representation of the exception (not the pickled exception like in FuturePythonTask). I construct a FunctionCallNoResult to return containing the string representation of the task's exception. This has the downside of not letting the user catch specific types of task exceptions (e.g., exceptions they might expect to occur), but they will at least see the true exception message and not just an empty FunctionCallNoResult.

Warning: This PR only fixes four of the five items in #3920. I didn't immediately see how to fix Future.done(). It calls Manager.task_state() which then calls cvine.vine_task_state but this function was removed in c1f1cd7. Could Future.done() be implemented by calling Future.result(timeout=0)? It's probably not very efficient to do it that way---maybe there's an alternative to the remove cvine.vine_task_state?

Please feel free to push to my branch if you want to alter anything yourself.

Merge Checklist

The following items must be completed before PRs can be merge.
Check these off to verify you have completed all steps.

  • make test Run local tests prior to pushing.
  • make format Format source code to comply with lint policies. Note that some lint errors can only be resolved manually (e.g., Python)
  • make lint Run lint on source code prior to pushing.
  • Manual Update Update the manual to reflect user-visible changes.
  • Type Labels Select a github label for the type: bugfix, enhancement, etc.
  • Product Labels Select a github label for the product: TaskVine, Makeflow, etc.
  • PR RTM Mark your PR as ready to merge.

In Python 3.9 and older, concurrent.futures used its own TimeoutError
class rather than the builtin. TimeoutError is aliased to the builtin in
layer Python versions, so for backwards compatibility we import it in
the test file.

See python/cpython#30197
@dthain
Copy link
Member

dthain commented Aug 29, 2024

@BarrySlyDelgado what do you think?

Copy link
Contributor

@BarrySlyDelgado BarrySlyDelgado left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes look good to me. Thanks @gpauloski!

@gpauloski
Copy link
Contributor Author

Thanks! What should be done about the missing cvine.vine_task_state? I'm happy to push a fix if there is one, but maybe it should become a separate issue?

@BarrySlyDelgado
Copy link
Contributor

#3929 should be fix future.done() and future.running() checking if a task is cancelled still needs to be fixed. Will create an issue.

@gpauloski
Copy link
Contributor Author

Great! FWIW none of my applications use task cancelling so that won't block me (many other executors have limited support for task cancelling).

@BarrySlyDelgado
Copy link
Contributor

@dthain This should be ready to be merged. Then we can merge #3929.

@dthain dthain merged commit d1dba75 into cooperative-computing-lab:master Sep 16, 2024
8 checks passed
colinthomas-z80 pushed a commit to colinthomas-z80/cctools that referenced this pull request Oct 15, 2024
…rative-computing-lab#3922)

* raise task exceptions in python task futures

* wrap FutureFunctionCall tracebacks in FunctionCallNoResult

* Raise TimeoutError in Future when task result is pending

* yield futures and raise TimeoutError in as_completed

* Improve error handling in wait

* Import TimeoutError for Python <3.10 compatibility

In Python 3.9 and older, concurrent.futures used its own TimeoutError
class rather than the builtin. TimeoutError is aliased to the builtin in
layer Python versions, so for backwards compatibility we import it in
the test file.

See python/cpython#30197
btovar pushed a commit that referenced this pull request Oct 21, 2024
* raise task exceptions in python task futures

* wrap FutureFunctionCall tracebacks in FunctionCallNoResult

* Raise TimeoutError in Future when task result is pending

* yield futures and raise TimeoutError in as_completed

* Improve error handling in wait

* Import TimeoutError for Python <3.10 compatibility

In Python 3.9 and older, concurrent.futures used its own TimeoutError
class rather than the builtin. TimeoutError is aliased to the builtin in
layer Python versions, so for backwards compatibility we import it in
the test file.

See python/cpython#30197
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants