-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[core][experimental] Raise an exception if a leaf node is found during compilation #47757
Conversation
Can you create an issue for this? Also can you share me the error message? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit comment for improving error message further.
python/ray/dag/compiled_dag_node.py
Outdated
"Compiled DAG doesn't support leaf nodes that don't have " | ||
"downstream nodes and are not output nodes. There are " | ||
f"{len(leaf_nodes)} leaf nodes in the DAG. Please add them to " | ||
f"the MultiOutputNode. These nodes are: {leaf_nodes}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you improve the error message to show how to solve this error step-by-step? For example, assuming a leaf node is w.f.bind() it could say sth like add the output of w.f.bind() to MultiOutputNode
What I recommend you is to try raising the error on your own and fix it looking at this error message. I think it is not very trivial if you assume you are not a developer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated cc9c823
@rkooo567 This was the original implementation of this PR, but we decided to raise an exception for leaf nodes after our discussions. Do we still need to open an issue to track it? What error message are you referring to? |
cc9c823
to
2e67aa4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking a bit more on the problem, does it make a real difference if we disallow leaf nodes and force the user to add leaf nodes manually to MultiOutputNode
?
Using the same example as mentioned, a MultiOutputNode is containing 3 DAG nodes (2 normal DAG nodes + 1 leaf node).
Previously we have:
x, y = compiled_dag.execute(input_vals)
and the 3rd return value is silently GC-ed and it calls get()
underneath.
When the user changes to
x, y, z = compiled_dag.execute(input_vals)
z
is not used by the user and also discarded, the same GC and get
happens. Is there a difference here?
Signed-off-by: Kai-Hsun Chen <[email protected]>
Signed-off-by: Kai-Hsun Chen <[email protected]>
Signed-off-by: Kai-Hsun Chen <[email protected]>
Signed-off-by: Kai-Hsun Chen <[email protected]>
Co-authored-by: Rui Qiao <[email protected]> Signed-off-by: Kai-Hsun Chen <[email protected]>
Signed-off-by: Kai-Hsun Chen <[email protected]>
e371e3f
to
08b147c
Compare
gentle ping - @ruisearch42 @rkooo567 |
Generally looks good, but after #47689 , implicitly adding leaf node to MultiOutputNode will work? And do we still need this change? |
Signed-off-by: Kai-Hsun Chen <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome error message!
…g compilation (ray-project#47757) Leaf nodes are nodes that are not output nodes and have no downstream nodes. If a leaf node raises an exception, it will not be propagated to the driver. Therefore, this PR raises an exception if a leaf node is found during compilation. Another solution: implicitly add leaf node to MultiOutputNode Currently, the function execute can return multiple CompiledDAGRefs. The UX we want to provide is to implicitly add leaf nodes to the MultiOutputNode but not return the references of the leaf nodes. For example, a MultiOutputNode is containing 3 DAG nodes (2 normal DAG nodes + 1 leaf node). x, y = compiled_dag.execute(input_vals) # We don't return the ref for the leaf node. However, the ref for leaf node will be GC(ed) in execute, and CompiledDAGRef’s del will call get if it was never called which makes execute to be a sync instead of an async operation which is not acceptable.
…g compilation (ray-project#47757) Leaf nodes are nodes that are not output nodes and have no downstream nodes. If a leaf node raises an exception, it will not be propagated to the driver. Therefore, this PR raises an exception if a leaf node is found during compilation. Another solution: implicitly add leaf node to MultiOutputNode Currently, the function execute can return multiple CompiledDAGRefs. The UX we want to provide is to implicitly add leaf nodes to the MultiOutputNode but not return the references of the leaf nodes. For example, a MultiOutputNode is containing 3 DAG nodes (2 normal DAG nodes + 1 leaf node). x, y = compiled_dag.execute(input_vals) # We don't return the ref for the leaf node. However, the ref for leaf node will be GC(ed) in execute, and CompiledDAGRef’s del will call get if it was never called which makes execute to be a sync instead of an async operation which is not acceptable. Signed-off-by: JP-sDEV <[email protected]>
…g compilation (ray-project#47757) Leaf nodes are nodes that are not output nodes and have no downstream nodes. If a leaf node raises an exception, it will not be propagated to the driver. Therefore, this PR raises an exception if a leaf node is found during compilation. Another solution: implicitly add leaf node to MultiOutputNode Currently, the function execute can return multiple CompiledDAGRefs. The UX we want to provide is to implicitly add leaf nodes to the MultiOutputNode but not return the references of the leaf nodes. For example, a MultiOutputNode is containing 3 DAG nodes (2 normal DAG nodes + 1 leaf node). x, y = compiled_dag.execute(input_vals) # We don't return the ref for the leaf node. However, the ref for leaf node will be GC(ed) in execute, and CompiledDAGRef’s del will call get if it was never called which makes execute to be a sync instead of an async operation which is not acceptable. Signed-off-by: mohitjain2504 <[email protected]>
Why are these changes needed?
Leaf nodes are nodes that are not output nodes and have no downstream nodes. If a leaf node raises an exception, it will not be propagated to the driver. Therefore, this PR raises an exception if a leaf node is found during compilation.
Another solution: implicitly add leaf node to
MultiOutputNode
Currently, the function execute can return multiple
CompiledDAGRef
s. The UX we want to provide is to implicitly add leaf nodes to theMultiOutputNode
but not return the references of the leaf nodes. For example, aMultiOutputNode
is containing 3 DAG nodes (2 normal DAG nodes + 1 leaf node).However, the ref for leaf node will be GC(ed) in execute, and
CompiledDAGRef
’s del will call get if it was never called which makes execute to be a sync instead of an async operation which is not acceptable.Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.