Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core][compiled-graph] Should we support leaf nodes? #47977

Closed
kevin85421 opened this issue Oct 10, 2024 · 3 comments
Closed

[core][compiled-graph] Should we support leaf nodes? #47977

kevin85421 opened this issue Oct 10, 2024 · 3 comments
Labels
compiled-graphs enhancement Request for new feature and/or capability question Just a question :)

Comments

@kevin85421
Copy link
Member

Description

#47757 raises an exception if it detects any leaf nodes. However, asking users to add all leaf nodes to MultiOutputNode is not the best UX. Some possible solutions:

  1. Launch another thread in the driver process to watch exceptions from leaf nodes.
  2. Provide another way for leaf nodes to notify the driver process when they throw exceptions. For example, leaf nodes can talk to a "supervisor" actor on the head node. (input from @rynewang)

@ruisearch42's comment: #47757 (review)

Use case

No response

@kevin85421 kevin85421 added enhancement Request for new feature and/or capability triage Needs triage (eg: priority, bug/not-bug, and owning component) compiled-graphs and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Oct 10, 2024
@rynewang
Copy link
Contributor

I think it's good to let compiled graphs return a Tuple[Output, Exceptions from leaf nodes]. When compiling a graph this means an edge to pass exceptions from each leaf node to the output (driver?).

An actor to receive messages is a good way on the user side to do, e.g. even when there's no exceptions user can send control messages and events to an actor like a message queue. But Ray can't do that just for exception passing.

One side note is on ordinary Ray, if you make a remote call and don't ray.get it, the exception can go away silently (when the obj ref goes out of scope). Maybe we can come up with something to help that case as well.

@stephanie-wang
Copy link
Contributor

In ordinary Ray, you are supposed to get a log message on the driver if an ObjectRef that contained an exception goes out of scope and no one called ray.get on it. See message here.

So regarding @ruisearch42's comment, there is a difference in with the new API, because now when you drop the error's reference, you should get a log message about it.

If the logging isn't working for ordinary Ray or compiled Ray, it is a bug in either case and needs to be fixed.

@stephanie-wang stephanie-wang added the question Just a question :) label Oct 10, 2024
@stephanie-wang
Copy link
Contributor

Seems like @rkooo567 is already looking into this for compiled Ray in #47689.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiled-graphs enhancement Request for new feature and/or capability question Just a question :)
Projects
None yet
Development

No branches or pull requests

4 participants